Overview

Dataset statistics

Number of variables: 149
Number of observations: 1926393
Missing cells: 160886057
Missing cells (%): 56.1%
Total size in memory: 2.1 GiB
Average record size in memory: 1.2 KiB
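
The missing-cell percentage can be cross-checked from the other figures in this block. The sketch below assumes the percentage is simply missing cells divided by variables times observations; the variable names are illustrative and not taken from the profiling tool.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 149
n_observations = 1_926_393
missing_cells = 160_886_057

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 287032557
print(round(missing_pct, 1))  # 56.1
```

Under that assumption the result agrees with the reported 56.1%.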

Variable types

Text: 149

Dataset

Description: Invertebrate Zoology NMNH Extant Specimen Records 0052489-241126133413365
URL: https://doi.org/10.15468/dl.fya67r

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "National Museum of Natural History, Smithsonian Institution" Constant
institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "IZ" Constant
datasetName has constant value "NMNH Extant Biology" Constant
materialSampleID has constant value "NORTH_AMERICA" Constant
eventID has constant value "North Pacific Ocean, Gulf Of California" Constant
samplingEffort has constant value "24.1667" Constant
fieldNotes has constant value "-110.283" Constant
georeferencedDate has constant value "8" Constant
latestEonOrHighestEonothem has constant value "US" Constant
earliestEraOrLowestErathem has constant value "Idaho" Constant
earliestAgeOrLowestStage has constant value "NORTH_AMERICA" Constant
latestAgeOrHighestStage has constant value "North Pacific Ocean, Departure Bay" Constant
bed has constant value "Moultrie" Constant
identificationRemarks has constant value "-83.7685" Constant
acceptedNameUsage has constant value "SPECIES" Constant
parentNameUsage has constant value "GEOLocate" Constant
namePublishedIn has constant value "ACCEPTED" Constant
subgenus has constant value "false" Constant
cultivarEpithet has constant value "108" Constant
protocol has constant value "EML" Constant
relativeOrganismQuantity has constant value "821cc27a-e3bb-4bc5-ac34-89ada245069d" Constant
recordNumber has 1804640 (93.7%) missing values Missing
recordedBy has 764111 (39.7%) missing values Missing
sex has 1802980 (93.6%) missing values Missing
lifeStage has 1888856 (98.1%) missing values Missing
disposition has 1926391 (> 99.9%) missing values Missing
associatedOccurrences has 1926391 (> 99.9%) missing values Missing
associatedReferences has 1926391 (> 99.9%) missing values Missing
associatedSequences has 1921269 (99.7%) missing values Missing
associatedTaxa has 1926391 (> 99.9%) missing values Missing
occurrenceRemarks has 1144485 (59.4%) missing values Missing
verbatimLabel has 1926391 (> 99.9%) missing values Missing
materialSampleID has 1926391 (> 99.9%) missing values Missing
eventID has 1926392 (> 99.9%) missing values Missing
fieldNumber has 1339759 (69.5%) missing values Missing
eventDate has 688611 (35.7%) missing values Missing
startDayOfYear has 842313 (43.7%) missing values Missing
endDayOfYear has 842311 (43.7%) missing values Missing
year has 689273 (35.8%) missing values Missing
month has 800939 (41.6%) missing values Missing
day has 887053 (46.0%) missing values Missing
verbatimEventDate has 1173199 (60.9%) missing values Missing
habitat has 1857136 (96.4%) missing values Missing
samplingEffort has 1926392 (> 99.9%) missing values Missing
fieldNotes has 1926392 (> 99.9%) missing values Missing
locationID has 984066 (51.1%) missing values Missing
higherGeography has 67831 (3.5%) missing values Missing
continent has 1027391 (53.3%) missing values Missing
waterBody has 666651 (34.6%) missing values Missing
islandGroup has 1925623 (> 99.9%) missing values Missing
island has 1925415 (99.9%) missing values Missing
countryCode has 110759 (5.7%) missing values Missing
stateProvince has 943673 (49.0%) missing values Missing
county has 1786420 (92.7%) missing values Missing
locality has 642386 (33.3%) missing values Missing
verbatimElevation has 1925931 (> 99.9%) missing values Missing
verbatimDepth has 1900149 (98.6%) missing values Missing
decimalLatitude has 927346 (48.1%) missing values Missing
decimalLongitude has 927346 (48.1%) missing values Missing
verbatimCoordinateSystem has 1246885 (64.7%) missing values Missing
verbatimSRS has 1926391 (> 99.9%) missing values Missing
footprintSRS has 1926391 (> 99.9%) missing values Missing
footprintSpatialFit has 1926391 (> 99.9%) missing values Missing
georeferencedBy has 1926391 (> 99.9%) missing values Missing
georeferencedDate has 1926391 (> 99.9%) missing values Missing
georeferenceProtocol has 1265790 (65.7%) missing values Missing
georeferenceSources has 1926390 (> 99.9%) missing values Missing
georeferenceRemarks has 1896105 (98.4%) missing values Missing
latestEonOrHighestEonothem has 1926392 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 1926392 (> 99.9%) missing values Missing
earliestEpochOrLowestSeries has 1926391 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 1926390 (> 99.9%) missing values Missing
earliestAgeOrLowestStage has 1926390 (> 99.9%) missing values Missing
latestAgeOrHighestStage has 1926392 (> 99.9%) missing values Missing
lithostratigraphicTerms has 1926388 (> 99.9%) missing values Missing
group has 1926391 (> 99.9%) missing values Missing
bed has 1926392 (> 99.9%) missing values Missing
identificationQualifier has 1908260 (99.1%) missing values Missing
typeStatus has 1841066 (95.6%) missing values Missing
identifiedBy has 1085208 (56.3%) missing values Missing
identifiedByID has 1926391 (> 99.9%) missing values Missing
dateIdentified has 1926391 (> 99.9%) missing values Missing
identificationVerificationStatus has 1926390 (> 99.9%) missing values Missing
identificationRemarks has 1926392 (> 99.9%) missing values Missing
parentNameUsageID has 1926391 (> 99.9%) missing values Missing
namePublishedInID has 1926391 (> 99.9%) missing values Missing
acceptedNameUsage has 1926391 (> 99.9%) missing values Missing
parentNameUsage has 1926392 (> 99.9%) missing values Missing
namePublishedIn has 1926391 (> 99.9%) missing values Missing
class has 66157 (3.4%) missing values Missing
order has 329537 (17.1%) missing values Missing
family has 144488 (7.5%) missing values Missing
subtribe has 1926391 (> 99.9%) missing values Missing
genus has 358044 (18.6%) missing values Missing
genericName has 358043 (18.6%) missing values Missing
subgenus has 1926391 (> 99.9%) missing values Missing
infragenericEpithet has 1926391 (> 99.9%) missing values Missing
specificEpithet has 626798 (32.5%) missing values Missing
infraspecificEpithet has 1890289 (98.1%) missing values Missing
cultivarEpithet has 1926391 (> 99.9%) missing values Missing
verbatimTaxonRank has 1926391 (> 99.9%) missing values Missing
vernacularName has 1926391 (> 99.9%) missing values Missing
nomenclaturalCode has 1926389 (> 99.9%) missing values Missing
nomenclaturalStatus has 1926391 (> 99.9%) missing values Missing
taxonRemarks has 1926390 (> 99.9%) missing values Missing
elevation has 1919570 (99.6%) missing values Missing
elevationAccuracy has 1922885 (99.8%) missing values Missing
depth has 1143682 (59.4%) missing values Missing
depthAccuracy has 1205339 (62.6%) missing values Missing
distanceFromCentroidInMeters has 1917545 (99.5%) missing values Missing
mediaType has 1683241 (87.4%) missing values Missing
classKey has 66158 (3.4%) missing values Missing
orderKey has 329533 (17.1%) missing values Missing
familyKey has 144485 (7.5%) missing values Missing
genusKey has 358041 (18.6%) missing values Missing
subgenusKey has 1926388 (> 99.9%) missing values Missing
speciesKey has 626819 (32.5%) missing values Missing
species has 626822 (32.5%) missing values Missing
verbatimScientificName has 353775 (18.4%) missing values Missing
repatriated has 110144 (5.7%) missing values Missing
relativeOrganismQuantity has 1926392 (> 99.9%) missing values Missing
projectId has 1926390 (> 99.9%) missing values Missing
gbifRegion has 115678 (6.0%) missing values Missing
level0Gid has 1691070 (87.8%) missing values Missing
level0Name has 1691070 (87.8%) missing values Missing
level1Gid has 1694638 (88.0%) missing values Missing
level1Name has 1694634 (88.0%) missing values Missing
level2Gid has 1708984 (88.7%) missing values Missing
level2Name has 1709049 (88.7%) missing values Missing
level3Gid has 1886622 (97.9%) missing values Missing
level3Name has 1887342 (98.0%) missing values Missing
iucnRedListCategory has 469562 (24.4%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
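
The three alert kinds listed above (Constant, Missing, Unique) can be illustrated with a small sketch. This is not the profiling tool's implementation: the function name, the use of `None` for a missing cell, and the `missing_threshold` parameter are all assumptions made for illustration, and the report does not state which cut-off it uses.

```python
from collections import Counter

def column_alerts(values, missing_threshold=0.0):
    """Return alert labels for one column; None marks a missing cell.

    missing_threshold is a hypothetical cut-off (fraction of rows).
    """
    n = len(values)
    present = [v for v in values if v is not None]
    n_missing = n - len(present)
    counts = Counter(present)

    alerts = []
    # Constant: exactly one distinct value among the non-missing cells.
    if len(counts) == 1:
        alerts.append("Constant")
    # Missing: share of missing cells exceeds the assumed threshold.
    if n and n_missing / n > missing_threshold:
        alerts.append("Missing")
    # Unique: no missing cells and every value occurs exactly once.
    if present and n_missing == 0 and len(counts) == n:
        alerts.append("Unique")
    return alerts

print(column_alerts(["a", None, None]))  # ['Constant', 'Missing']
print(column_alerts(["1", "2", "3"]))    # ['Unique']
```

A column such as eventID, which holds a single non-missing value, would carry both a Constant and a Missing label under this sketch, which matches how it appears twice in the list.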

Reproduction

Analysis started: 2025-01-08 22:50:57.098131
Analysis finished: 2025-01-08 22:52:33.162435
Duration: 1 minute and 36.06 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
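
The reported duration follows from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-01-08 22:50:57.098131", FMT)
finished = datetime.strptime("2025-01-08 22:52:33.162435", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```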

Variables

gbifID
Text

Unique 

Distinct: 1926393
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 19263930
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
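
As a rough illustration of the category counts in these sections, Python's standard `unicodedata` module reports the general category of each character as a two-letter code (`Nd` for Decimal Number, `Lu` for Uppercase Letter, `Pc` for Connector Punctuation). The helper name below is hypothetical, and the sketch covers categories only, not scripts or blocks.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (two-letter code)."""
    return Counter(unicodedata.category(ch) for ch in text)

# A gbifID sample value: ten decimal digits.
print(category_counts("1321728981"))

# The constant license value: two letters, three digits, two underscores.
print(category_counts("CC0_1_0"))
```

For `CC0_1_0` the split is 3 of 7 characters in Decimal Number and 2 of 7 each in Uppercase Letter and Connector Punctuation, which corresponds to the 42.9% / 28.6% / 28.6% shares reported for the license variable.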

Unique

Unique: 1926393
Unique (%): 100.0%

Sample

1st row: 1321728981
2nd row: 1320179422
3rd row: 1320179575
4th row: 1321729723
5th row: 1320179846

Value Count Frequency (%)
1321728981 1 < 0.1%
2565454742 1 < 0.1%
1320179846 1 < 0.1%
1321730497 1 < 0.1%
1320180949 1 < 0.1%
1320181165 1 < 0.1%
1456364805 1 < 0.1%
1320182209 1 < 0.1%
1321732097 1 < 0.1%
2571470239 1 < 0.1%
Other values (1926383) 1926383 > 99.9%

Most occurring characters

Value Count Frequency (%)
1 3941642 20.5%
3 2930195 15.2%
2 2443917 12.7%
7 1519890 7.9%
8 1483841 7.7%
0 1476009 7.7%
9 1469022 7.6%
5 1371397 7.1%
6 1317118 6.8%
4 1310899 6.8%

Most occurring categories

Value Count Frequency (%)
Decimal Number 19263930 100.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
1 3941642 20.5%
3 2930195 15.2%
2 2443917 12.7%
7 1519890 7.9%
8 1483841 7.7%
0 1476009 7.7%
9 1469022 7.6%
5 1371397 7.1%
6 1317118 6.8%
4 1310899 6.8%

Most occurring scripts

Value Count Frequency (%)
Common 19263930 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
1 3941642 20.5%
3 2930195 15.2%
2 2443917 12.7%
7 1519890 7.9%
8 1483841 7.7%
0 1476009 7.7%
9 1469022 7.6%
5 1371397 7.1%
6 1317118 6.8%
4 1310899 6.8%

Most occurring blocks

Value Count Frequency (%)
ASCII 19263930 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
1 3941642 20.5%
3 2930195 15.2%
2 2443917 12.7%
7 1519890 7.9%
8 1483841 7.7%
0 1476009 7.7%
9 1469022 7.6%
5 1371397 7.1%
6 1317118 6.8%
4 1310899 6.8%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 13484751
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0

Value Count Frequency (%)
cc0_1_0 1926393 100.0%

Most occurring characters

Value Count Frequency (%)
C 3852786 28.6%
0 3852786 28.6%
_ 3852786 28.6%
1 1926393 14.3%

Most occurring categories

Value Count Frequency (%)
Decimal Number 5779179 42.9%
Uppercase Letter 3852786 28.6%
Connector Punctuation 3852786 28.6%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 3852786 66.7%
1 1926393 33.3%
Uppercase Letter
Value Count Frequency (%)
C 3852786 100.0%
Connector Punctuation
Value Count Frequency (%)
_ 3852786 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 9631965 71.4%
Latin 3852786 28.6%

Most frequent character per script

Common
Value Count Frequency (%)
0 3852786 40.0%
_ 3852786 40.0%
1 1926393 20.0%
Latin
Value Count Frequency (%)
C 3852786 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 13484751 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
C 3852786 28.6%
0 3852786 28.6%
_ 3852786 28.6%
1 1926393 14.3%

Distinct: 113487
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 38527860
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62369
Unique (%): 3.2%

Sample

1st row: 2021-10-06T15:29:00Z
2nd row: 2024-09-25T16:08:00Z
3rd row: 2020-01-06T17:42:00Z
4th row: 2018-09-17T12:46:00Z
5th row: 2024-09-25T15:32:00Z

Value Count Frequency (%)
2024-09-25t13:44:00z 9049 0.5%
2024-09-25t13:46:00z 8728 0.5%
2024-09-25t17:07:00z 8646 0.4%
2024-09-25t17:10:00z 8633 0.4%
2024-09-25t17:05:00z 8623 0.4%
2024-09-25t13:45:00z 8553 0.4%
2024-09-25t17:11:00z 8500 0.4%
2024-09-25t17:08:00z 8494 0.4%
2024-09-25t15:27:00z 8472 0.4%
2024-09-25t17:15:00z 8471 0.4%
Other values (113477) 1840224 95.5%

Most occurring characters

Value Count Frequency (%)
0 8971409 23.3%
2 4988502 12.9%
1 4688771 12.2%
- 3852786 10.0%
: 3852786 10.0%
T 1926393 5.0%
Z 1926393 5.0%
4 1757743 4.6%
5 1702088 4.4%
9 1536985 4.0%
Other values (4) 3324004 8.6%

Most occurring categories

Value Count Frequency (%)
Decimal Number 26969502 70.0%
Dash Punctuation 3852786 10.0%
Other Punctuation 3852786 10.0%
Uppercase Letter 3852786 10.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 8971409 33.3%
2 4988502 18.5%
1 4688771 17.4%
4 1757743 6.5%
5 1702088 6.3%
9 1536985 5.7%
3 1149855 4.3%
7 807767 3.0%
6 701085 2.6%
8 665297 2.5%
Uppercase Letter
Value Count Frequency (%)
T 1926393 50.0%
Z 1926393 50.0%
Dash Punctuation
Value Count Frequency (%)
- 3852786 100.0%
Other Punctuation
Value Count Frequency (%)
: 3852786 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 34675074 90.0%
Latin 3852786 10.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 8971409 25.9%
2 4988502 14.4%
1 4688771 13.5%
- 3852786 11.1%
: 3852786 11.1%
4 1757743 5.1%
5 1702088 4.9%
9 1536985 4.4%
3 1149855 3.3%
7 807767 2.3%
Other values (2) 1366382 3.9%
Latin
Value Count Frequency (%)
T 1926393 50.0%
Z 1926393 50.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 38527860 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 8971409 23.3%
2 4988502 12.9%
1 4688771 12.2%
- 3852786 10.0%
: 3852786 10.0%
T 1926393 5.0%
Z 1926393 5.0%
4 1757743 4.6%
5 1702088 4.4%
9 1536985 4.0%
Other values (4) 3324004 8.6%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 59
Median length: 59
Mean length: 59
Min length: 59

Characters and Unicode

Total characters: 113657187
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National Museum of Natural History, Smithsonian Institution
2nd row: National Museum of Natural History, Smithsonian Institution
3rd row: National Museum of Natural History, Smithsonian Institution
4th row: National Museum of Natural History, Smithsonian Institution
5th row: National Museum of Natural History, Smithsonian Institution

Value Count Frequency (%)
national 1926393 14.3%
museum 1926393 14.3%
of 1926393 14.3%
natural 1926393 14.3%
history 1926393 14.3%
smithsonian 1926393 14.3%
institution 1926393 14.3%
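
The word table for this variable lists lowercased tokens rather than whole cell values, each with an equal one-in-seven share. The sketch below reproduces that pattern under an assumed tokenization (lowercase, split on whitespace, strip surrounding punctuation). The profiling tool's actual rules are not stated in the report and may differ, and the function name is hypothetical.

```python
import string
from collections import Counter

def word_frequencies(values):
    """Count lowercased, punctuation-stripped tokens across all values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts

# Three copies stand in for the 1926393 identical rows.
sample = ["National Museum of Natural History, Smithsonian Institution"] * 3
freq = word_frequencies(sample)
total = sum(freq.values())
for word, n in freq.items():
    print(word, n, f"{100 * n / total:.1f}%")
```

With seven distinct tokens per row, each token accounts for 1/7 of all tokens, which rounds to the 14.3% shown in the table.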

Most occurring characters

Value Count Frequency (%)
t 13484751 11.9%
i 11558358 10.2%
(space) 11558358 10.2%
a 9631965 8.5%
o 9631965 8.5%
n 9631965 8.5%
s 7705572 6.8%
u 7705572 6.8%
r 3852786 3.4%
m 3852786 3.4%
Other values (11) 25043109 22.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 88614078 78.0%
Space Separator 11558358 10.2%
Uppercase Letter 11558358 10.2%
Other Punctuation 1926393 1.7%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
t 13484751 15.2%
i 11558358 13.0%
a 9631965 10.9%
o 9631965 10.9%
n 9631965 10.9%
s 7705572 8.7%
u 7705572 8.7%
r 3852786 4.3%
m 3852786 4.3%
l 3852786 4.3%
Other values (4) 7705572 8.7%
Uppercase Letter
Value Count Frequency (%)
N 3852786 33.3%
M 1926393 16.7%
H 1926393 16.7%
S 1926393 16.7%
I 1926393 16.7%
Space Separator
Value Count Frequency (%)
(space) 11558358 100.0%
Other Punctuation
Value Count Frequency (%)
, 1926393 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 100172436 88.1%
Common 13484751 11.9%

Most frequent character per script

Latin
Value Count Frequency (%)
t 13484751 13.5%
i 11558358 11.5%
a 9631965 9.6%
o 9631965 9.6%
n 9631965 9.6%
s 7705572 7.7%
u 7705572 7.7%
r 3852786 3.8%
m 3852786 3.8%
N 3852786 3.8%
Other values (9) 19263930 19.2%
Common
Value Count Frequency (%)
(space) 11558358 85.7%
, 1926393 14.3%

Most occurring blocks

Value Count Frequency (%)
ASCII 113657187 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
t 13484751 11.9%
i 11558358 10.2%
(space) 11558358 10.2%
a 9631965 8.5%
o 9631965 8.5%
n 9631965 8.5%
s 7705572 6.8%
u 7705572 6.8%
r 3852786 3.4%
m 3852786 3.4%
Other values (11) 25043109 22.0%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 55865397
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value Count Frequency (%)
urn:lsid:biocol.org:col:34871 1926393 100.0%

Most occurring characters

Value Count Frequency (%)
o 7705572 13.8%
: 7705572 13.8%
l 5779179 10.3%
i 3852786 6.9%
r 3852786 6.9%
c 3852786 6.9%
g 1926393 3.4%
7 1926393 3.4%
8 1926393 3.4%
4 1926393 3.4%
Other values (8) 15411144 27.6%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 36601467 65.5%
Other Punctuation 9631965 17.2%
Decimal Number 9631965 17.2%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
o 7705572 21.1%
l 5779179 15.8%
i 3852786 10.5%
r 3852786 10.5%
c 3852786 10.5%
g 1926393 5.3%
u 1926393 5.3%
b 1926393 5.3%
d 1926393 5.3%
s 1926393 5.3%
Decimal Number
Value Count Frequency (%)
7 1926393 20.0%
8 1926393 20.0%
4 1926393 20.0%
3 1926393 20.0%
1 1926393 20.0%
Other Punctuation
Value Count Frequency (%)
: 7705572 80.0%
. 1926393 20.0%

Most occurring scripts

Value Count Frequency (%)
Latin 36601467 65.5%
Common 19263930 34.5%

Most frequent character per script

Latin
Value Count Frequency (%)
o 7705572 21.1%
l 5779179 15.8%
i 3852786 10.5%
r 3852786 10.5%
c 3852786 10.5%
g 1926393 5.3%
u 1926393 5.3%
b 1926393 5.3%
d 1926393 5.3%
s 1926393 5.3%
Common
Value Count Frequency (%)
: 7705572 40.0%
7 1926393 10.0%
8 1926393 10.0%
4 1926393 10.0%
3 1926393 10.0%
. 1926393 10.0%
1 1926393 10.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 55865397 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
o 7705572 13.8%
: 7705572 13.8%
l 5779179 10.3%
i 3852786 6.9%
r 3852786 6.9%
c 3852786 6.9%
g 1926393 3.4%
7 1926393 3.4%
8 1926393 3.4%
4 1926393 3.4%
Other values (8) 15411144 27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 86687685
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
2nd row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
3rd row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
4th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
5th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6

Value Count Frequency (%)
urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6 1926393 100.0%

Most occurring characters

Value Count Frequency (%)
d 9631965 11.1%
1 7705572 8.9%
- 7705572 8.9%
u 5779179 6.7%
8 5779179 6.7%
2 5779179 6.7%
4 5779179 6.7%
c 5779179 6.7%
f 5779179 6.7%
9 3852786 4.4%
Other values (9) 23116716 26.7%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 40454253 46.7%
Decimal Number 34675074 40.0%
Dash Punctuation 7705572 8.9%
Other Punctuation 3852786 4.4%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
d 9631965 23.8%
u 5779179 14.3%
c 5779179 14.3%
f 5779179 14.3%
b 3852786 9.5%
r 1926393 4.8%
i 1926393 4.8%
a 1926393 4.8%
n 1926393 4.8%
e 1926393 4.8%
Decimal Number
Value Count Frequency (%)
1 7705572 22.2%
8 5779179 16.7%
2 5779179 16.7%
4 5779179 16.7%
9 3852786 11.1%
7 3852786 11.1%
6 1926393 5.6%
Dash Punctuation
Value Count Frequency (%)
- 7705572 100.0%
Other Punctuation
Value Count Frequency (%)
: 3852786 100.0%

Most occurring scripts

Value Count Frequency (%)
Common 46233432 53.3%
Latin 40454253 46.7%

Most frequent character per script

Latin
Value Count Frequency (%)
d 9631965 23.8%
u 5779179 14.3%
c 5779179 14.3%
f 5779179 14.3%
b 3852786 9.5%
r 1926393 4.8%
i 1926393 4.8%
a 1926393 4.8%
n 1926393 4.8%
e 1926393 4.8%
Common
Value Count Frequency (%)
1 7705572 16.7%
- 7705572 16.7%
8 5779179 12.5%
2 5779179 12.5%
4 5779179 12.5%
9 3852786 8.3%
: 3852786 8.3%
7 3852786 8.3%
6 1926393 4.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 86687685 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
d 9631965 11.1%
1 7705572 8.9%
- 7705572 8.9%
u 5779179 6.7%
8 5779179 6.7%
2 5779179 6.7%
4 5779179 6.7%
c 5779179 6.7%
f 5779179 6.7%
9 3852786 4.4%
Other values (9) 23116716 26.7%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 7705572
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value Count Frequency (%)
usnm 1926393 100.0%

Most occurring characters

Value Count Frequency (%)
U 1926393 25.0%
S 1926393 25.0%
N 1926393 25.0%
M 1926393 25.0%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 7705572 100.0%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
U 1926393 25.0%
S 1926393 25.0%
N 1926393 25.0%
M 1926393 25.0%

Most occurring scripts

Value Count Frequency (%)
Latin 7705572 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
U 1926393 25.0%
S 1926393 25.0%
N 1926393 25.0%
M 1926393 25.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 7705572 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
U 1926393 25.0%
S 1926393 25.0%
N 1926393 25.0%
M 1926393 25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 3852786
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IZ
2nd row: IZ
3rd row: IZ
4th row: IZ
5th row: IZ

Value Count Frequency (%)
iz 1926393 100.0%

Most occurring characters

Value Count Frequency (%)
I 1926393 50.0%
Z 1926393 50.0%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 3852786 100.0%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
I 1926393 50.0%
Z 1926393 50.0%

Most occurring scripts

Value Count Frequency (%)
Latin 3852786 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
I 1926393 50.0%
Z 1926393 50.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 3852786 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
I 1926393 50.0%
Z 1926393 50.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 36601467
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value Count Frequency (%)
nmnh 1926393 33.3%
extant 1926393 33.3%
biology 1926393 33.3%

Most occurring characters

Value Count Frequency (%)
N 3852786 10.5%
(space) 3852786 10.5%
t 3852786 10.5%
o 3852786 10.5%
M 1926393 5.3%
H 1926393 5.3%
E 1926393 5.3%
x 1926393 5.3%
a 1926393 5.3%
n 1926393 5.3%
Other values (5) 9631965 26.3%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 21190323 57.9%
Uppercase Letter 11558358 31.6%
Space Separator 3852786 10.5%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
t 3852786 18.2%
o 3852786 18.2%
x 1926393 9.1%
a 1926393 9.1%
n 1926393 9.1%
i 1926393 9.1%
l 1926393 9.1%
g 1926393 9.1%
y 1926393 9.1%
Uppercase Letter
Value Count Frequency (%)
N 3852786 33.3%
M 1926393 16.7%
H 1926393 16.7%
E 1926393 16.7%
B 1926393 16.7%
Space Separator
Value Count Frequency (%)
(space) 3852786 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 32748681 89.5%
Common 3852786 10.5%

Most frequent character per script

Latin
Value Count Frequency (%)
N 3852786 11.8%
t 3852786 11.8%
o 3852786 11.8%
M 1926393 5.9%
H 1926393 5.9%
E 1926393 5.9%
x 1926393 5.9%
a 1926393 5.9%
n 1926393 5.9%
B 1926393 5.9%
Other values (4) 7705572 23.5%
Common
Value Count Frequency (%)
(space) 3852786 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 36601467 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
N 3852786 10.5%
(space) 3852786 10.5%
t 3852786 10.5%
o 3852786 10.5%
M 1926393 5.3%
H 1926393 5.3%
E 1926393 5.3%
x 1926393 5.3%
a 1926393 5.3%
n 1926393 5.3%
Other values (5) 9631965 26.3%

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 19
Median length: 18
Mean length: 18.00144052
Min length: 17

Characters and Unicode

Total characters: 34677849
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: PRESERVED_SPECIMEN

Value Count Frequency (%)
preserved_specimen 1922256 99.8%
machine_observation 3456 0.2%
human_observation 681 < 0.1%
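
The length and character totals for this variable can be recomputed from the three value counts. The sketch assumes each value's length is the length of the token shown in the table; the uppercase spellings come from the sample rows.

```python
# (value, count) pairs from the value table for this variable.
rows = [
    ("PRESERVED_SPECIMEN", 1_922_256),
    ("MACHINE_OBSERVATION", 3_456),
    ("HUMAN_OBSERVATION", 681),
]

total_rows = sum(count for _, count in rows)
total_chars = sum(len(value) * count for value, count in rows)
mean_length = total_chars / total_rows

print(total_rows)             # 1926393
print(total_chars)            # 34677849
print(round(mean_length, 8))  # 18.00144052
```

The totals agree with the reported number of observations, total characters, and mean length.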

Most occurring characters

Value Count Frequency (%)
E 9618873 27.7%
R 3848649 11.1%
S 3848649 11.1%
P 3844512 11.1%
N 1930530 5.6%
I 1929849 5.6%
_ 1926393 5.6%
M 1926393 5.6%
V 1926393 5.6%
C 1925712 5.6%
Other values (7) 1951896 5.6%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 32751456 94.4%
Connector Punctuation 1926393 5.6%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
E 9618873 29.4%
R 3848649 11.8%
S 3848649 11.8%
P 3844512 11.7%
N 1930530 5.9%
I 1929849 5.9%
M 1926393 5.9%
V 1926393 5.9%
C 1925712 5.9%
D 1922256 5.9%
Other values (6) 29640 0.1%
Connector Punctuation
Value Count Frequency (%)
_ 1926393 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 32751456 94.4%
Common 1926393 5.6%

Most frequent character per script

Latin
Value Count Frequency (%)
E 9618873 29.4%
R 3848649 11.8%
S 3848649 11.8%
P 3844512 11.7%
N 1930530 5.9%
I 1929849 5.9%
M 1926393 5.9%
V 1926393 5.9%
C 1925712 5.9%
D 1922256 5.9%
Other values (6) 29640 0.1%
Common
Value Count Frequency (%)
_ 1926393 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 34677849 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
E 9618873 27.7%
R 3848649 11.1%
S 3848649 11.1%
P 3844512 11.1%
N 1930530 5.6%
I 1929849 5.6%
_ 1926393 5.6%
M 1926393 5.6%
V 1926393 5.6%
C 1925712 5.6%
Other values (7) 1951896 5.6%

occurrenceID
Text

Unique 

Distinct: 1926393
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 121362759
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1926393
Unique (%): 100.0%

Sample

1st rowhttp://n2t.net/ark:/65665/3c831e8df-8799-47a1-8dcf-bcb0b77fd3e3
2nd rowhttp://n2t.net/ark:/65665/383ab647e-23a7-4086-b71e-36212ccc0eb2
3rd rowhttp://n2t.net/ark:/65665/383adbf6e-f769-4dc3-8bef-550530af49ee
4th rowhttp://n2t.net/ark:/65665/3c83aad38-c935-46fa-96c3-e450ebb169cf
5th rowhttp://n2t.net/ark:/65665/383b126a6-bf3a-4908-bc33-e4435555fcc5
ValueCountFrequency (%)
http://n2t.net/ark:/65665/3c831e8df-8799-47a1-8dcf-bcb0b77fd3e3 1
 
< 0.1%
http://n2t.net/ark:/65665/3c8609028-15fe-4982-820a-6e4cef3b3db1 1
 
< 0.1%
http://n2t.net/ark:/65665/383b126a6-bf3a-4908-bc33-e4435555fcc5 1
 
< 0.1%
http://n2t.net/ark:/65665/3c843fd56-7874-4858-b938-14fdfcb5544c 1
 
< 0.1%
http://n2t.net/ark:/65665/383bcb698-5477-4feb-9966-d9adae345f09 1
 
< 0.1%
http://n2t.net/ark:/65665/383bfd766-40bc-4ede-82ca-0df3775130f3 1
 
< 0.1%
http://n2t.net/ark:/65665/3c84cf22c-2b9b-49fb-91ed-f85efd9e9fa7 1
 
< 0.1%
http://n2t.net/ark:/65665/383cb8e2a-4f46-4138-82be-3d7989851c9e 1
 
< 0.1%
http://n2t.net/ark:/65665/3c856104b-9825-44b9-8b57-e69b58510bf8 1
 
< 0.1%
http://n2t.net/ark:/65665/3c856ef4e-b135-45c8-8511-c533777f0d7a 1
 
< 0.1%
Other values (1926383) 1926383
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 9631965
 
7.9%
6 9394397
 
7.7%
- 7705572
 
6.3%
t 7705572
 
6.3%
5 7461179
 
6.1%
a 6018602
 
5.0%
3 5539470
 
4.6%
e 5537694
 
4.6%
2 5537394
 
4.6%
4 5534549
 
4.6%
Other values (16) 51296365
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 52493612
43.3%
Lowercase Letter 45752431
37.7%
Other Punctuation 15411144
 
12.7%
Dash Punctuation 7705572
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 7705572
16.8%
a 6018602
13.2%
e 5537694
12.1%
b 4095432
9.0%
n 3852786
8.4%
d 3615504
7.9%
c 3611680
7.9%
f 3609589
7.9%
k 1926393
 
4.2%
r 1926393
 
4.2%
Other values (2) 3852786
8.4%
Decimal Number
ValueCountFrequency (%)
6 9394397
17.9%
5 7461179
14.2%
3 5539470
10.6%
2 5537394
10.5%
4 5534549
10.5%
8 4095501
7.8%
9 4095255
7.8%
1 3613792
 
6.9%
7 3611338
 
6.9%
0 3610737
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 9631965
62.5%
: 3852786
 
25.0%
. 1926393
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 7705572
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 75610328
62.3%
Latin 45752431
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 9631965
12.7%
6 9394397
12.4%
- 7705572
10.2%
5 7461179
9.9%
3 5539470
7.3%
2 5537394
7.3%
4 5534549
7.3%
8 4095501
 
5.4%
9 4095255
 
5.4%
: 3852786
 
5.1%
Other values (4) 12762260
16.9%
Latin
ValueCountFrequency (%)
t 7705572
16.8%
a 6018602
13.2%
e 5537694
12.1%
b 4095432
9.0%
n 3852786
8.4%
d 3615504
7.9%
c 3611680
7.9%
f 3609589
7.9%
k 1926393
 
4.2%
r 1926393
 
4.2%
Other values (2) 3852786
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 121362759
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 9631965
 
7.9%
6 9394397
 
7.7%
- 7705572
 
6.3%
t 7705572
 
6.3%
5 7461179
 
6.1%
a 6018602
 
5.0%
3 5539470
 
4.6%
e 5537694
 
4.6%
2 5537394
 
4.6%
4 5534549
 
4.6%
Other values (16) 51296365
42.3%
Distinct1355393
Distinct (%)70.4%
Missing5
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length16
Median length11
Mean length11.0374042
Min length6

Characters and Unicode

Total characters21262323
Distinct characters63
Distinct categories8
Distinct scripts2
Distinct blocks1

Unique

Unique1024476
Unique (%)53.2%

Sample

1st rowUSNM 1119015
2nd rowUSNM 55168
3rd rowUSNM 52536
4th rowUSNM E40844
5th rowUSNM 241160
ValueCountFrequency (%)
usnm 1926388
50.0%
31
 
< 0.1%
284908 16
 
< 0.1%
653324 13
 
< 0.1%
5357 11
 
< 0.1%
15490 10
 
< 0.1%
22869 10
 
< 0.1%
859036 10
 
< 0.1%
224878 10
 
< 0.1%
40969 9
 
< 0.1%
Other values (1352149) 1926301
50.0%

Most occurring characters

ValueCountFrequency (%)
M 1928507
 
9.1%
U 1926495
 
9.1%
1926421
 
9.1%
S 1926388
 
9.1%
N 1926388
 
9.1%
1 1809864
 
8.5%
2 1247566
 
5.9%
3 1147864
 
5.4%
4 1110834
 
5.2%
5 1088355
 
5.1%
Other values (53) 5223641
24.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11557547
54.4%
Uppercase Letter 7763402
36.5%
Space Separator 1926421
 
9.1%
Lowercase Letter 11690
 
0.1%
Other Punctuation 3259
 
< 0.1%
Dash Punctuation 2
 
< 0.1%
Close Punctuation 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8276
70.8%
b 1739
 
14.9%
c 637
 
5.4%
d 326
 
2.8%
e 206
 
1.8%
f 143
 
1.2%
g 87
 
0.7%
h 61
 
0.5%
i 40
 
0.3%
j 35
 
0.3%
Other values (16) 140
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
M 1928507
24.8%
U 1926495
24.8%
S 1926388
24.8%
N 1926388
24.8%
E 53455
 
0.7%
I 778
 
< 0.1%
A 697
 
< 0.1%
X 326
 
< 0.1%
B 177
 
< 0.1%
D 128
 
< 0.1%
Other values (10) 63
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1809864
15.7%
2 1247566
10.8%
3 1147864
9.9%
4 1110834
9.6%
5 1088355
9.4%
8 1073474
9.3%
6 1062349
9.2%
7 1058933
9.2%
0 1002104
8.7%
9 956204
8.3%
Other Punctuation
ValueCountFrequency (%)
* 3252
99.8%
. 6
 
0.2%
& 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1926421
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13487231
63.4%
Latin 7775092
36.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1928507
24.8%
U 1926495
24.8%
S 1926388
24.8%
N 1926388
24.8%
E 53455
 
0.7%
a 8276
 
0.1%
b 1739
 
< 0.1%
I 778
 
< 0.1%
A 697
 
< 0.1%
c 637
 
< 0.1%
Other values (36) 1732
 
< 0.1%
Common
ValueCountFrequency (%)
1926421
14.3%
1 1809864
13.4%
2 1247566
9.2%
3 1147864
8.5%
4 1110834
8.2%
5 1088355
8.1%
8 1073474
8.0%
6 1062349
7.9%
7 1058933
7.9%
0 1002104
7.4%
Other values (7) 959467
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21262323
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1928507
 
9.1%
U 1926495
 
9.1%
1926421
 
9.1%
S 1926388
 
9.1%
N 1926388
 
9.1%
1 1809864
 
8.5%
2 1247566
 
5.9%
3 1147864
 
5.4%
4 1110834
 
5.2%
5 1088355
 
5.1%
Other values (53) 5223641
24.6%

recordNumber
Text

Missing 

Distinct119495
Distinct (%)98.1%
Missing1804640
Missing (%)93.7%
Memory size14.7 MiB

Length

Max length87
Median length14
Mean length13.17353166
Min length1

Characters and Unicode

Total characters1603917
Distinct characters81
Distinct categories10
Distinct scripts2
Distinct blocks2

Unique

Unique118866
Unique (%)97.6%

Sample

1st rowUSNPC # 001298
2nd rowFPlrv_430
3rd rowH-2284
4th rowUSNPC # 066527
5th rowUSNPC # 009815
ValueCountFrequency (%)
88145
28.7%
usnpc 88064
28.6%
ullz 5209
 
1.7%
rh 1566
 
0.5%
k-rh 1555
 
0.5%
ce16007-event 223
 
0.1%
2208 102
 
< 0.1%
1430 92
 
< 0.1%
1513 80
 
< 0.1%
beauty 75
 
< 0.1%
Other values (119414) 122317
39.8%

Most occurring characters

ValueCountFrequency (%)
185675
 
11.6%
0 161175
 
10.0%
C 97557
 
6.1%
S 95231
 
5.9%
U 94869
 
5.9%
P 94146
 
5.9%
N 93453
 
5.8%
# 88221
 
5.5%
1 83004
 
5.2%
2 65151
 
4.1%
Other values (71) 545435
34.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 709687
44.2%
Uppercase Letter 576534
35.9%
Space Separator 185675
 
11.6%
Other Punctuation 91637
 
5.7%
Dash Punctuation 15241
 
1.0%
Connector Punctuation 14091
 
0.9%
Lowercase Letter 10490
 
0.7%
Close Punctuation 281
 
< 0.1%
Open Punctuation 271
 
< 0.1%
Math Symbol 10
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 97557
16.9%
S 95231
16.5%
U 94869
16.5%
P 94146
16.3%
N 93453
16.2%
L 12317
 
2.1%
E 11806
 
2.0%
R 10316
 
1.8%
I 7528
 
1.3%
B 7241
 
1.3%
Other values (16) 52070
9.0%
Lowercase Letter
ValueCountFrequency (%)
l 1416
13.5%
v 1363
13.0%
a 1349
12.9%
r 1268
12.1%
t 873
8.3%
e 713
6.8%
s 657
 
6.3%
n 489
 
4.7%
c 300
 
2.9%
i 287
 
2.7%
Other values (16) 1775
16.9%
Decimal Number
ValueCountFrequency (%)
0 161175
22.7%
1 83004
11.7%
2 65151
9.2%
6 58928
 
8.3%
3 58890
 
8.3%
7 58489
 
8.2%
4 56685
 
8.0%
8 56225
 
7.9%
9 55924
 
7.9%
5 55216
 
7.8%
Other Punctuation
ValueCountFrequency (%)
# 88221
96.3%
. 2352
 
2.6%
: 559
 
0.6%
, 400
 
0.4%
; 65
 
0.1%
/ 20
 
< 0.1%
& 10
 
< 0.1%
? 7
 
< 0.1%
* 3
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 15240
> 99.9%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 273
97.2%
] 8
 
2.8%
Open Punctuation
ValueCountFrequency (%)
( 263
97.0%
[ 8
 
3.0%
Math Symbol
ValueCountFrequency (%)
+ 5
50.0%
= 5
50.0%
Space Separator
ValueCountFrequency (%)
185675
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 14091
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1016893
63.4%
Latin 587024
36.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 97557
16.6%
S 95231
16.2%
U 94869
16.2%
P 94146
16.0%
N 93453
15.9%
L 12317
 
2.1%
E 11806
 
2.0%
R 10316
 
1.8%
I 7528
 
1.3%
B 7241
 
1.2%
Other values (42) 62560
10.7%
Common
ValueCountFrequency (%)
185675
18.3%
0 161175
15.8%
# 88221
8.7%
1 83004
8.2%
2 65151
 
6.4%
6 58928
 
5.8%
3 58890
 
5.8%
7 58489
 
5.8%
4 56685
 
5.6%
8 56225
 
5.5%
Other values (19) 144450
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1603916
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
185675
 
11.6%
0 161175
 
10.0%
C 97557
 
6.1%
S 95231
 
5.9%
U 94869
 
5.9%
P 94146
 
5.9%
N 93453
 
5.8%
# 88221
 
5.5%
1 83004
 
5.2%
2 65151
 
4.1%
Other values (70) 545434
34.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

recordedBy
Text

Missing 

Distinct37540
Distinct (%)3.2%
Missing764111
Missing (%)39.7%
Memory size14.7 MiB

Length

Max length24975
Median length156
Mean length23.05844881
Min length1

Characters and Unicode

Total characters26800420
Distinct characters99
Distinct categories11
Distinct scripts2
Distinct blocks2

Unique

Unique16583
Unique (%)1.4%

Sample

1st rowVIMS for BLM/ MMS
2nd rowLgl Ecological Research Associates/ Environmental Science And Engineering For BLM/ MMS
3rd rowUniversity of Southern California
4th rowUnited States Fish Commission
5th rowUnited States Fish Commission
ValueCountFrequency (%)
mms 181011
 
4.2%
blm 181009
 
4.2%
for 178053
 
4.2%
fish 168374
 
3.9%
united 164153
 
3.8%
states 163489
 
3.8%
commission 157086
 
3.7%
149581
 
3.5%
of 101785
 
2.4%
j 101464
 
2.4%
Other values (19944) 2737862
63.9%

Most occurring characters

ValueCountFrequency (%)
3119233
 
11.6%
e 2082533
 
7.8%
i 1879315
 
7.0%
n 1616256
 
6.0%
t 1592703
 
5.9%
o 1549732
 
5.8%
s 1530048
 
5.7%
a 1499473
 
5.6%
r 1221276
 
4.6%
M 808831
 
3.0%
Other values (89) 9901020
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17592737
65.6%
Uppercase Letter 4868630
 
18.2%
Space Separator 3119233
 
11.6%
Other Punctuation 1145135
 
4.3%
Dash Punctuation 53764
 
0.2%
Decimal Number 11194
 
< 0.1%
Control 7851
 
< 0.1%
Open Punctuation 688
 
< 0.1%
Close Punctuation 688
 
< 0.1%
Connector Punctuation 479
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2082533
11.8%
i 1879315
10.7%
n 1616256
9.2%
t 1592703
9.1%
o 1549732
8.8%
s 1530048
8.7%
a 1499473
8.5%
r 1221276
 
6.9%
l 768101
 
4.4%
h 563773
 
3.2%
Other values (31) 3289527
18.7%
Uppercase Letter
ValueCountFrequency (%)
M 808831
16.6%
S 654014
13.4%
B 397882
 
8.2%
C 365204
 
7.5%
F 349332
 
7.2%
L 335906
 
6.9%
U 267323
 
5.5%
H 212626
 
4.4%
R 189118
 
3.9%
W 154368
 
3.2%
Other values (17) 1134026
23.3%
Other Punctuation
ValueCountFrequency (%)
. 741943
64.8%
/ 238289
 
20.8%
& 118079
 
10.3%
, 45735
 
4.0%
: 572
 
< 0.1%
' 383
 
< 0.1%
; 79
 
< 0.1%
" 40
 
< 0.1%
? 11
 
< 0.1%
# 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 2213
19.8%
1 1914
17.1%
0 1548
13.8%
9 1226
11.0%
4 936
8.4%
8 714
 
6.4%
5 696
 
6.2%
6 693
 
6.2%
3 681
 
6.1%
7 573
 
5.1%
Control
ValueCountFrequency (%)
7816
99.6%
35
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 686
99.7%
{ 2
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 686
99.7%
} 2
 
0.3%
Space Separator
ValueCountFrequency (%)
3119233
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 53764
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 479
100.0%
Math Symbol
ValueCountFrequency (%)
+ 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22461367
83.8%
Common 4339053
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2082533
 
9.3%
i 1879315
 
8.4%
n 1616256
 
7.2%
t 1592703
 
7.1%
o 1549732
 
6.9%
s 1530048
 
6.8%
a 1499473
 
6.7%
r 1221276
 
5.4%
M 808831
 
3.6%
l 768101
 
3.4%
Other values (58) 7913099
35.2%
Common
ValueCountFrequency (%)
3119233
71.9%
. 741943
 
17.1%
/ 238289
 
5.5%
& 118079
 
2.7%
- 53764
 
1.2%
, 45735
 
1.1%
7816
 
0.2%
2 2213
 
0.1%
1 1914
 
< 0.1%
0 1548
 
< 0.1%
Other values (21) 8519
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26799498
> 99.9%
None 922
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3119233
 
11.6%
e 2082533
 
7.8%
i 1879315
 
7.0%
n 1616256
 
6.0%
t 1592703
 
5.9%
o 1549732
 
5.8%
s 1530048
 
5.7%
a 1499473
 
5.6%
r 1221276
 
4.6%
M 808831
 
3.0%
Other values (73) 9900098
36.9%
None
ValueCountFrequency (%)
é 455
49.3%
ü 102
 
11.1%
á 93
 
10.1%
ö 65
 
7.0%
ä 57
 
6.2%
ó 53
 
5.7%
í 49
 
5.3%
è 15
 
1.6%
ñ 12
 
1.3%
ç 9
 
1.0%
Other values (6) 12
 
1.3%
Distinct1067
Distinct (%)0.1%
Missing156
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length5
Median length1
Mean length1.108392166
Min length1

Characters and Unicode

Total characters2135026
Distinct characters10
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique413
Unique (%)< 0.1%

Sample

1st row1
2nd row11
3rd row1
4th row26
5th row1
ValueCountFrequency (%)
1 995782
51.7%
2 289569
 
15.0%
3 135771
 
7.0%
4 99105
 
5.1%
5 73928
 
3.8%
6 51745
 
2.7%
10 38953
 
2.0%
7 31375
 
1.6%
8 30170
 
1.6%
9 18501
 
1.0%
Other values (1057) 161338
 
8.4%

Most occurring characters

ValueCountFrequency (%)
1 1131608
53.0%
2 345490
 
16.2%
3 162143
 
7.6%
4 118965
 
5.6%
5 110284
 
5.2%
0 93507
 
4.4%
6 64569
 
3.0%
7 42177
 
2.0%
8 40055
 
1.9%
9 26228
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2135026
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1131608
53.0%
2 345490
 
16.2%
3 162143
 
7.6%
4 118965
 
5.6%
5 110284
 
5.2%
0 93507
 
4.4%
6 64569
 
3.0%
7 42177
 
2.0%
8 40055
 
1.9%
9 26228
 
1.2%

Most occurring scripts

ValueCountFrequency (%)
Common 2135026
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1131608
53.0%
2 345490
 
16.2%
3 162143
 
7.6%
4 118965
 
5.6%
5 110284
 
5.2%
0 93507
 
4.4%
6 64569
 
3.0%
7 42177
 
2.0%
8 40055
 
1.9%
9 26228
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2135026
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1131608
53.0%
2 345490
 
16.2%
3 162143
 
7.6%
4 118965
 
5.6%
5 110284
 
5.2%
0 93507
 
4.4%
6 64569
 
3.0%
7 42177
 
2.0%
8 40055
 
1.9%
9 26228
 
1.2%

sex
Text

Missing 

Distinct3
Distinct (%)< 0.1%
Missing1802980
Missing (%)93.6%
Memory size14.7 MiB

Length

Max length13
Median length6
Mean length5.129864763
Min length4

Characters and Unicode

Total characters633092
Distinct characters12
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowFEMALE
2nd rowFEMALE
3rd rowMALE
4th rowMALE
5th rowFEMALE
ValueCountFrequency (%)
female 68541
55.5%
male 54610
44.2%
hermaphrodite 262
 
0.2%

Most occurring characters

ValueCountFrequency (%)
E 192216
30.4%
M 123413
19.5%
A 123413
19.5%
L 123151
19.5%
F 68541
 
10.8%
H 524
 
0.1%
R 524
 
0.1%
P 262
 
< 0.1%
O 262
 
< 0.1%
D 262
 
< 0.1%
Other values (2) 524
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 633092
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 192216
30.4%
M 123413
19.5%
A 123413
19.5%
L 123151
19.5%
F 68541
 
10.8%
H 524
 
0.1%
R 524
 
0.1%
P 262
 
< 0.1%
O 262
 
< 0.1%
D 262
 
< 0.1%
Other values (2) 524
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 633092
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 192216
30.4%
M 123413
19.5%
A 123413
19.5%
L 123151
19.5%
F 68541
 
10.8%
H 524
 
0.1%
R 524
 
0.1%
P 262
 
< 0.1%
O 262
 
< 0.1%
D 262
 
< 0.1%
Other values (2) 524
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 633092
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 192216
30.4%
M 123413
19.5%
A 123413
19.5%
L 123151
19.5%
F 68541
 
10.8%
H 524
 
0.1%
R 524
 
0.1%
P 262
 
< 0.1%
O 262
 
< 0.1%
D 262
 
< 0.1%
Other values (2) 524
 
0.1%

lifeStage
Text

Missing 

Distinct19
Distinct (%)0.1%
Missing1888856
Missing (%)98.1%
Memory size14.7 MiB

Length

Max length9
Median length8
Mean length6.544262994
Min length3

Characters and Unicode

Total characters245652
Distinct characters35
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowLarva
2nd rowJuvenile
3rd rowLarva
4th rowJuvenile
5th rowLarva
ValueCountFrequency (%)
juvenile 18119
48.3%
adult 9874
26.3%
larva 7695
20.5%
immature 711
 
1.9%
mature 247
 
0.7%
subadult 244
 
0.7%
egg 142
 
0.4%
megalopa 131
 
0.3%
veliger 126
 
0.3%
zoea 95
 
0.3%
Other values (9) 153
 
0.4%

Most occurring characters

ValueCountFrequency (%)
e 37685
15.3%
u 29584
12.0%
l 28565
11.6%
v 25814
10.5%
i 18319
7.5%
n 18135
7.4%
J 18119
7.4%
a 17028
6.9%
t 11097
 
4.5%
d 10135
 
4.1%
Other values (25) 31171
12.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 208115
84.7%
Uppercase Letter 37537
 
15.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 37685
18.1%
u 29584
14.2%
l 28565
13.7%
v 25814
12.4%
i 18319
8.8%
n 18135
8.7%
a 17028
8.2%
t 11097
 
5.3%
d 10135
 
4.9%
r 8805
 
4.2%
Other values (11) 2948
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
J 18119
48.3%
A 9874
26.3%
L 7695
20.5%
I 711
 
1.9%
M 389
 
1.0%
S 244
 
0.7%
E 162
 
0.4%
V 126
 
0.3%
Z 95
 
0.3%
N 87
 
0.2%
Other values (4) 35
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 245652
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 37685
15.3%
u 29584
12.0%
l 28565
11.6%
v 25814
10.5%
i 18319
7.5%
n 18135
7.4%
J 18119
7.4%
a 17028
6.9%
t 11097
 
4.5%
d 10135
 
4.1%
Other values (25) 31171
12.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 245652
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 37685
15.3%
u 29584
12.0%
l 28565
11.6%
v 25814
10.5%
i 18319
7.5%
n 18135
7.4%
J 18119
7.4%
a 17028
6.9%
t 11097
 
4.5%
d 10135
 
4.1%
Other values (25) 31171
12.7%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size14.7 MiB

Length

Max length10
Median length7
Mean length6.997880495
Min length6

Characters and Unicode

Total characters13480668
Distinct characters15
Distinct categories3
Distinct scripts2
Distinct blocks1

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowPRESENT
2nd rowPRESENT
3rd rowPRESENT
4th rowPRESENT
5th rowPRESENT
ValueCountFrequency (%)
present 1922302
99.8%
absent 4089
 
0.2%
1993-09-09 1
 
< 0.1%
1938-09-22 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 3848693
28.5%
S 1926391
14.3%
N 1926391
14.3%
T 1926391
14.3%
P 1922302
14.3%
R 1922302
14.3%
A 4089
 
< 0.1%
B 4089
 
< 0.1%
9 6
 
< 0.1%
- 4
 
< 0.1%
Other values (5) 10
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 13480648
> 99.9%
Decimal Number 16
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3848693
28.5%
S 1926391
14.3%
N 1926391
14.3%
T 1926391
14.3%
P 1922302
14.3%
R 1922302
14.3%
A 4089
 
< 0.1%
B 4089
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
9 6
37.5%
0 3
18.8%
1 2
 
12.5%
3 2
 
12.5%
2 2
 
12.5%
8 1
 
6.2%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13480648
> 99.9%
Common 20
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 3848693
28.5%
S 1926391
14.3%
N 1926391
14.3%
T 1926391
14.3%
P 1922302
14.3%
R 1922302
14.3%
A 4089
 
< 0.1%
B 4089
 
< 0.1%
Common
ValueCountFrequency (%)
9 6
30.0%
- 4
20.0%
0 3
15.0%
1 2
 
10.0%
3 2
 
10.0%
2 2
 
10.0%
8 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13480668
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 3848693
28.5%
S 1926391
14.3%
N 1926391
14.3%
T 1926391
14.3%
P 1922302
14.3%
R 1922302
14.3%
A 4089
 
< 0.1%
B 4089
 
< 0.1%
9 6
 
< 0.1%
- 4
 
< 0.1%
Other values (5) 10
 
< 0.1%
Distinct527
Distinct (%)< 0.1%
Missing1860
Missing (%)0.1%
Memory size14.7 MiB

Length

Max length167
Median length157
Mean length10.12228005
Min length3

Characters and Unicode

Total characters19480662
Distinct characters53
Distinct categories8
Distinct scripts2
Distinct blocks1

Unique

Unique212
Unique (%)< 0.1%

Sample

1st rowAlcohol (Ethanol)
2nd rowDry
3rd rowAlcohol (Ethanol)
4th rowDry
5th rowDry
ValueCountFrequency (%)
ethanol 907118
30.8%
dry 902342
30.6%
alcohol 897625
30.5%
slide 129646
 
4.4%
19548
 
0.7%
95 16839
 
0.6%
formalin 12585
 
0.4%
biorepository 12373
 
0.4%
isopropyl 10055
 
0.3%
sorting 6036
 
0.2%
Other values (40) 31872
 
1.1%

Most occurring characters

ValueCountFrequency (%)
l 2866431
14.7%
o 2797187
14.4%
h 1806308
 
9.3%
1021506
 
5.2%
r 954329
 
4.9%
t 939560
 
4.8%
n 936854
 
4.8%
a 925743
 
4.8%
y 923987
 
4.7%
E 913018
 
4.7%
Other values (43) 5395739
27.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13613226
69.9%
Uppercase Letter 2925118
 
15.0%
Space Separator 1021506
 
5.2%
Close Punctuation 887570
 
4.6%
Open Punctuation 887570
 
4.6%
Other Punctuation 86959
 
0.4%
Decimal Number 39165
 
0.2%
Dash Punctuation 19548
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2866431
21.1%
o 2797187
20.5%
h 1806308
13.3%
r 954329
 
7.0%
t 939560
 
6.9%
n 936854
 
6.9%
a 925743
 
6.8%
y 923987
 
6.8%
c 898646
 
6.6%
i 181357
 
1.3%
Other values (12) 382824
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
E 913018
31.2%
D 902616
30.9%
A 898725
30.7%
S 153320
 
5.2%
I 13804
 
0.5%
F 12984
 
0.4%
B 12731
 
0.4%
M 5938
 
0.2%
R 4592
 
0.2%
Y 4591
 
0.2%
Other values (9) 2799
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 18431
47.1%
5 17782
45.4%
0 1802
 
4.6%
8 1081
 
2.8%
1 36
 
0.1%
2 33
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 67410
77.5%
% 19549
 
22.5%
Space Separator
ValueCountFrequency (%)
1021506
100.0%
Close Punctuation
ValueCountFrequency (%)
) 887570
100.0%
Open Punctuation
ValueCountFrequency (%)
( 887570
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19548
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16538344
84.9%
Common 2942318
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2866431
17.3%
o 2797187
16.9%
h 1806308
10.9%
r 954329
 
5.8%
t 939560
 
5.7%
n 936854
 
5.7%
a 925743
 
5.6%
y 923987
 
5.6%
E 913018
 
5.5%
D 902616
 
5.5%
Other values (31) 2572311
15.6%
Common
ValueCountFrequency (%)
1021506
34.7%
) 887570
30.2%
( 887570
30.2%
; 67410
 
2.3%
% 19549
 
0.7%
- 19548
 
0.7%
9 18431
 
0.6%
5 17782
 
0.6%
0 1802
 
0.1%
8 1081
 
< 0.1%
Other values (2) 69
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19480662
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2866431
14.7%
o 2797187
14.4%
h 1806308
 
9.3%
1021506
 
5.2%
r 954329
 
4.9%
t 939560
 
4.8%
n 936854
 
4.8%
a 925743
 
4.8%
y 923987
 
4.7%
E 913018
 
4.7%
Other values (43) 5395739
27.7%

disposition
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters6
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st row252
2nd row265
ValueCountFrequency (%)
252 1
50.0%
265 1
50.0%

Most occurring characters

ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

associatedOccurrences
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters6
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st row252
2nd row265
ValueCountFrequency (%)
252 1
50.0%
265 1
50.0%

Most occurring characters

ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

associatedReferences
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters8
Distinct characters4
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique2
Unique (%)100.0%

Sample

1st row1993
2nd row1938
ValueCountFrequency (%)
1993 1
50.0%
1938 1
50.0%

Most occurring characters

ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

associatedSequences
Text

Missing 

Distinct5098
Distinct (%)99.5%
Missing1921269
Missing (%)99.7%
Memory size14.7 MiB

Length

Max length1349
Median length49
Mean length85.4980484
Min length1

Characters and Unicode

Total characters438092
Distinct characters61
Distinct categories7
Distinct scripts2
Distinct blocks1

Unique

Unique5082
Unique (%)99.2%

Sample

1st rowhttps://www.ncbi.nlm.nih.gov/gquery?term=AY426351;https://www.ncbi.nlm.nih.gov/gquery?term=AY379442;https://www.ncbi.nlm.nih.gov/gquery?term=AY426385
2nd rowhttps://www.ncbi.nlm.nih.gov/gquery?term=MH825989
3rd rowhttps://www.ncbi.nlm.nih.gov/gquery?term=MT223244
4th rowhttps://www.ncbi.nlm.nih.gov/gquery?term=MH826372
5th rowhttps://www.ncbi.nlm.nih.gov/gquery?term=KT792656
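
The sample rows above suggest that associatedSequences holds one or more NCBI query URLs joined by semicolons, each carrying an identifier in its term parameter. A minimal sketch of splitting such a field, assuming that layout holds for the URL-shaped values; the function name accession_ids is hypothetical, and values that are not URLs (the frequency table also lists a bare "9") simply yield no identifiers:

```python
from urllib.parse import parse_qs, urlparse


def accession_ids(field):
    """Return the `term` values from a semicolon-separated list of URLs."""
    ids = []
    for url in field.split(";"):
        url = url.strip()
        if not url:
            continue
        # parse_qs maps each query parameter to a list of its values.
        ids.extend(parse_qs(urlparse(url).query).get("term", []))
    return ids


# First sample row of associatedSequences, taken from this report.
row = (
    "https://www.ncbi.nlm.nih.gov/gquery?term=AY426351;"
    "https://www.ncbi.nlm.nih.gov/gquery?term=AY379442;"
    "https://www.ncbi.nlm.nih.gov/gquery?term=AY426385"
)
ids = accession_ids(row)
print(ids)
```

Splitting on the semicolon before parsing keeps each URL intact, so the result does not depend on how the query parser treats semicolons.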
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=km521547 12
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=kx362316;https://www.ncbi.nlm.nih.gov/gquery?term=kx362269 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ef060028;https://www.ncbi.nlm.nih.gov/gquery?term=kx362271 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj172481 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=srr9613700 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kx832080 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mk246581;https://www.ncbi.nlm.nih.gov/gquery?term=mk246487 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jq307001 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay643524 2
 
< 0.1%
9 2
 
< 0.1%
Other values (5088) 5094
99.4%

Most occurring characters

ValueCountFrequency (%)
. 35419
 
8.1%
t 26562
 
6.1%
/ 26562
 
6.1%
w 26562
 
6.1%
n 26562
 
6.1%
h 17708
 
4.0%
r 17708
 
4.0%
i 17708
 
4.0%
e 17708
 
4.0%
m 17708
 
4.0%
Other values (51) 207885
47.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 274474
62.7%
Other Punctuation 83421
 
19.0%
Decimal Number 53457
 
12.2%
Uppercase Letter 17884
 
4.1%
Math Symbol 8854
 
2.0%
Dash Punctuation 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 3906
21.8%
M 3764
21.0%
W 1587
8.9%
U 1539
 
8.6%
F 833
 
4.7%
J 772
 
4.3%
X 719
 
4.0%
C 697
 
3.9%
T 538
 
3.0%
H 533
 
3.0%
Other values (14) 2996
16.8%
Lowercase Letter
ValueCountFrequency (%)
t 26562
 
9.7%
w 26562
 
9.7%
n 26562
 
9.7%
h 17708
 
6.5%
r 17708
 
6.5%
i 17708
 
6.5%
e 17708
 
6.5%
m 17708
 
6.5%
g 17708
 
6.5%
q 8854
 
3.2%
Other values (9) 79686
29.0%
Decimal Number
ValueCountFrequency (%)
2 7334
13.7%
8 6190
11.6%
0 5590
10.5%
4 5209
9.7%
6 5207
9.7%
5 5041
9.4%
3 4920
9.2%
9 4839
9.1%
1 4744
8.9%
7 4383
8.2%
Other Punctuation
ValueCountFrequency (%)
. 35419
42.5%
/ 26562
31.8%
? 8854
 
10.6%
: 8854
 
10.6%
; 3732
 
4.5%
Math Symbol
ValueCountFrequency (%)
= 8854
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 292358
66.7%
Common 145734
33.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 26562
 
9.1%
w 26562
 
9.1%
n 26562
 
9.1%
h 17708
 
6.1%
r 17708
 
6.1%
i 17708
 
6.1%
e 17708
 
6.1%
m 17708
 
6.1%
g 17708
 
6.1%
q 8854
 
3.0%
Other values (33) 97570
33.4%
Common
ValueCountFrequency (%)
. 35419
24.3%
/ 26562
18.2%
= 8854
 
6.1%
? 8854
 
6.1%
: 8854
 
6.1%
2 7334
 
5.0%
8 6190
 
4.2%
0 5590
 
3.8%
4 5209
 
3.6%
6 5207
 
3.6%
Other values (8) 27661
19.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 438092
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 35419
 
8.1%
t 26562
 
6.1%
/ 26562
 
6.1%
w 26562
 
6.1%
n 26562
 
6.1%
h 17708
 
4.0%
r 17708
 
4.0%
i 17708
 
4.0%
e 17708
 
4.0%
m 17708
 
4.0%
Other values (51) 207885
47.5%

associatedTaxa
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926391
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 2
Median length 1.5
Mean length 1.5
Min length 1

Characters and Unicode

Total characters 3
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
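
As a rough illustration of the category tallies reported for each variable, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are assumptions made for this example; this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (illustrative subset only).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Tally the Unicode general category of every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)  # two-letter code, e.g. "Nd"
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Sample values taken from this report.
print(category_counts(["9", "22"]))
print(category_counts(["NORTH_AMERICA", "NORTH_AMERICA"]))
```

For the two associatedTaxa values (9 and 22) this yields three Decimal Number characters, in line with the category table for that variable.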

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 9
2nd row 22
ValueCountFrequency (%)
9 1
50.0%
22 1
50.0%

Most occurring characters

ValueCountFrequency (%)
2 2
66.7%
9 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2
66.7%
9 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2
66.7%
9 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2
66.7%
9 1
33.3%

occurrenceRemarks
Text

Missing 

Distinct 384906
Distinct (%) 49.2%
Missing 1144485
Missing (%) 59.4%
Memory size 14.7 MiB

Length

Max length 48983
Median length 1371
Mean length 61.51201036
Min length 1

Characters and Unicode

Total characters 48096733
Distinct characters 133
Distinct categories 18
Distinct scripts 3
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 322690
Unique (%) 41.3%

Sample

1st row Jewett.; Stearns.
2nd row Bartsch
3rd row 15 Nov. 1973; Jones, Dawson, del Rosario; Fitzgerald; NMNH-STRI Survey
4th row U. S. B. Fish
5th row C.R. Laws
ValueCountFrequency (%)
coll 143199
 
2.1%
of 115369
 
1.7%
and 111363
 
1.7%
a 107288
 
1.6%
by 89612
 
1.3%
87811
 
1.3%
2 65618
 
1.0%
3 63129
 
0.9%
was 62154
 
0.9%
formalin 58892
 
0.9%
Other values (238105) 5777747
86.5%

Most occurring characters

ValueCountFrequency (%)
5892887
 
12.3%
e 2965997
 
6.2%
o 2602001
 
5.4%
a 2414749
 
5.0%
i 2010061
 
4.2%
t 1978195
 
4.1%
n 1975689
 
4.1%
r 1877425
 
3.9%
s 1858443
 
3.9%
l 1812957
 
3.8%
Other values (123) 22708329
47.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 27332023
56.8%
Space Separator 5892887
 
12.3%
Uppercase Letter 5695540
 
11.8%
Other Punctuation 5002846
 
10.4%
Decimal Number 3447946
 
7.2%
Dash Punctuation 299803
 
0.6%
Open Punctuation 185687
 
0.4%
Close Punctuation 185536
 
0.4%
Control 24753
 
0.1%
Math Symbol 15128
 
< 0.1%
Other values (8) 14584
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2965997
10.9%
o 2602001
 
9.5%
a 2414749
 
8.8%
i 2010061
 
7.4%
t 1978195
 
7.2%
n 1975689
 
7.2%
r 1877425
 
6.9%
s 1858443
 
6.8%
l 1812957
 
6.6%
d 1161749
 
4.3%
Other values (32) 6674757
24.4%
Uppercase Letter
ValueCountFrequency (%)
C 697325
 
12.2%
S 676219
 
11.9%
B 359074
 
6.3%
F 347516
 
6.1%
P 326048
 
5.7%
N 312938
 
5.5%
M 290198
 
5.1%
A 263171
 
4.6%
R 240076
 
4.2%
H 232062
 
4.1%
Other values (17) 1950913
34.3%
Other Punctuation
ValueCountFrequency (%)
" 1194440
23.9%
. 1192250
23.8%
; 1044076
20.9%
, 582555
11.6%
: 568537
11.4%
% 166891
 
3.3%
/ 97157
 
1.9%
! 65397
 
1.3%
' 33800
 
0.7%
# 25850
 
0.5%
Other values (6) 31893
 
0.6%
Decimal Number
ValueCountFrequency (%)
1 686584
19.9%
2 449189
13.0%
9 387732
11.2%
0 371791
10.8%
3 303064
8.8%
7 287058
8.3%
5 256778
 
7.4%
6 251883
 
7.3%
4 239793
 
7.0%
8 214074
 
6.2%
Math Symbol
ValueCountFrequency (%)
+ 11235
74.3%
= 1994
 
13.2%
| 1638
 
10.8%
> 140
 
0.9%
~ 94
 
0.6%
< 23
 
0.2%
± 2
 
< 0.1%
× 2
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 3563
96.0%
91
 
2.5%
49
 
1.3%
6
 
0.2%
© 2
 
0.1%
® 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 299333
99.8%
469
 
0.2%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 95227
51.3%
{ 87819
47.3%
[ 2641
 
1.4%
Close Punctuation
ValueCountFrequency (%)
) 95098
51.3%
} 87813
47.3%
] 2625
 
1.4%
Other Number
ValueCountFrequency (%)
½ 1
33.3%
¼ 1
33.3%
³ 1
33.3%
Control
ValueCountFrequency (%)
24642
99.6%
111
 
0.4%
Currency Symbol
ValueCountFrequency (%)
$ 383
99.5%
2
 
0.5%
Final Punctuation
ValueCountFrequency (%)
213
99.5%
» 1
 
0.5%
Initial Punctuation
ValueCountFrequency (%)
213
99.5%
« 1
 
0.5%
Space Separator
ValueCountFrequency (%)
5892887
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8926
100.0%
Other Letter
ValueCountFrequency (%)
º 1128
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33028655
68.7%
Common 15068070
31.3%
Greek 8
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2965997
 
9.0%
o 2602001
 
7.9%
a 2414749
 
7.3%
i 2010061
 
6.1%
t 1978195
 
6.0%
n 1975689
 
6.0%
r 1877425
 
5.7%
s 1858443
 
5.6%
l 1812957
 
5.5%
d 1161749
 
3.5%
Other values (57) 12371389
37.5%
Common
ValueCountFrequency (%)
5892887
39.1%
" 1194440
 
7.9%
. 1192250
 
7.9%
; 1044076
 
6.9%
1 686584
 
4.6%
, 582555
 
3.9%
: 568537
 
3.8%
2 449189
 
3.0%
9 387732
 
2.6%
0 371791
 
2.5%
Other values (54) 2698029
17.9%
Greek
ValueCountFrequency (%)
μ 7
87.5%
π 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48090219
> 99.9%
None 5328
 
< 0.1%
Punctuation 1038
 
< 0.1%
Misc Symbols 146
 
< 0.1%
Currency Symbols 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5892887
 
12.3%
e 2965997
 
6.2%
o 2602001
 
5.4%
a 2414749
 
5.0%
i 2010061
 
4.2%
t 1978195
 
4.1%
n 1975689
 
4.1%
r 1877425
 
3.9%
s 1858443
 
3.9%
l 1812957
 
3.8%
Other values (86) 22701815
47.2%
None
ValueCountFrequency (%)
° 3563
66.9%
º 1128
 
21.2%
é 388
 
7.3%
ü 91
 
1.7%
ö 31
 
0.6%
µ 28
 
0.5%
ã 14
 
0.3%
à 12
 
0.2%
ó 11
 
0.2%
á 11
 
0.2%
Other values (18) 51
 
1.0%
Punctuation
ValueCountFrequency (%)
469
45.2%
213
20.5%
213
20.5%
142
 
13.7%
1
 
0.1%
Misc Symbols
ValueCountFrequency (%)
91
62.3%
49
33.6%
6
 
4.1%
Currency Symbols
ValueCountFrequency (%)
2
100.0%

verbatimLabel
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926391
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 62
Median length 48.5
Mean length 48.5
Min length 35

Characters and Unicode

Total characters 97
Distinct characters 28
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row North America, North Pacific Ocean, Gulf Of California, Mexico
2nd row North America, United States, Texas
ValueCountFrequency (%)
north 3
21.4%
america 2
14.3%
pacific 1
 
7.1%
ocean 1
 
7.1%
gulf 1
 
7.1%
of 1
 
7.1%
california 1
 
7.1%
mexico 1
 
7.1%
united 1
 
7.1%
states 1
 
7.1%

Most occurring characters

ValueCountFrequency (%)
12
 
12.4%
a 8
 
8.2%
i 8
 
8.2%
e 7
 
7.2%
r 6
 
6.2%
t 6
 
6.2%
c 6
 
6.2%
o 5
 
5.2%
, 5
 
5.2%
f 4
 
4.1%
Other values (18) 30
30.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 66
68.0%
Uppercase Letter 14
 
14.4%
Space Separator 12
 
12.4%
Other Punctuation 5
 
5.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
12.1%
i 8
12.1%
e 7
10.6%
r 6
9.1%
t 6
9.1%
c 6
9.1%
o 5
7.6%
f 4
 
6.1%
n 3
 
4.5%
h 3
 
4.5%
Other values (6) 10
15.2%
Uppercase Letter
ValueCountFrequency (%)
N 3
21.4%
A 2
14.3%
O 2
14.3%
P 1
 
7.1%
G 1
 
7.1%
C 1
 
7.1%
M 1
 
7.1%
U 1
 
7.1%
S 1
 
7.1%
T 1
 
7.1%
Space Separator
ValueCountFrequency (%)
12
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 80
82.5%
Common 17
 
17.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
 
10.0%
i 8
 
10.0%
e 7
 
8.8%
r 6
 
7.5%
t 6
 
7.5%
c 6
 
7.5%
o 5
 
6.2%
f 4
 
5.0%
n 3
 
3.8%
N 3
 
3.8%
Other values (16) 24
30.0%
Common
ValueCountFrequency (%)
12
70.6%
, 5
29.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 97
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12
 
12.4%
a 8
 
8.2%
i 8
 
8.2%
e 7
 
7.2%
r 6
 
6.2%
t 6
 
6.2%
c 6
 
6.2%
o 5
 
5.2%
, 5
 
5.2%
f 4
 
4.1%
Other values (18) 30
30.9%

materialSampleID
Text

Constant  Missing 

Distinct 1
Distinct (%) 50.0%
Missing 1926391
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 13
Median length 13
Mean length 13
Min length 13

Characters and Unicode

Total characters 26
Distinct characters 11
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row NORTH_AMERICA
2nd row NORTH_AMERICA
ValueCountFrequency (%)
north_america 2
100.0%

Most occurring characters

ValueCountFrequency (%)
R 4
15.4%
A 4
15.4%
N 2
7.7%
O 2
7.7%
T 2
7.7%
H 2
7.7%
_ 2
7.7%
M 2
7.7%
E 2
7.7%
I 2
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 24
92.3%
Connector Punctuation 2
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 4
16.7%
A 4
16.7%
N 2
8.3%
O 2
8.3%
T 2
8.3%
H 2
8.3%
M 2
8.3%
E 2
8.3%
I 2
8.3%
C 2
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 24
92.3%
Common 2
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 4
16.7%
A 4
16.7%
N 2
8.3%
O 2
8.3%
T 2
8.3%
H 2
8.3%
M 2
8.3%
E 2
8.3%
I 2
8.3%
C 2
8.3%
Common
ValueCountFrequency (%)
_ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 4
15.4%
A 4
15.4%
N 2
7.7%
O 2
7.7%
T 2
7.7%
H 2
7.7%
_ 2
7.7%
M 2
7.7%
E 2
7.7%
I 2
7.7%

eventID
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 1926392
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 39
Median length 39
Mean length 39
Min length 39

Characters and Unicode

Total characters 39
Distinct characters 19
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row North Pacific Ocean, Gulf Of California
ValueCountFrequency (%)
north 1
16.7%
pacific 1
16.7%
ocean 1
16.7%
gulf 1
16.7%
of 1
16.7%
california 1
16.7%

Most occurring characters

ValueCountFrequency (%)
5
12.8%
i 4
10.3%
a 4
10.3%
f 4
10.3%
c 3
 
7.7%
n 2
 
5.1%
r 2
 
5.1%
l 2
 
5.1%
o 2
 
5.1%
O 2
 
5.1%
Other values (9) 9
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 27
69.2%
Uppercase Letter 6
 
15.4%
Space Separator 5
 
12.8%
Other Punctuation 1
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
14.8%
a 4
14.8%
f 4
14.8%
c 3
11.1%
n 2
7.4%
r 2
7.4%
l 2
7.4%
o 2
7.4%
u 1
 
3.7%
e 1
 
3.7%
Other values (2) 2
7.4%
Uppercase Letter
ValueCountFrequency (%)
O 2
33.3%
G 1
16.7%
N 1
16.7%
P 1
16.7%
C 1
16.7%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33
84.6%
Common 6
 
15.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
12.1%
a 4
12.1%
f 4
12.1%
c 3
9.1%
n 2
 
6.1%
r 2
 
6.1%
l 2
 
6.1%
o 2
 
6.1%
O 2
 
6.1%
u 1
 
3.0%
Other values (7) 7
21.2%
Common
ValueCountFrequency (%)
5
83.3%
, 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5
12.8%
i 4
10.3%
a 4
10.3%
f 4
10.3%
c 3
 
7.7%
n 2
 
5.1%
r 2
 
5.1%
l 2
 
5.1%
o 2
 
5.1%
O 2
 
5.1%
Other values (9) 9
23.1%

fieldNumber
Text

Missing 

Distinct 62652
Distinct (%) 10.7%
Missing 1339759
Missing (%) 69.5%
Memory size 14.7 MiB

Length

Max length 111
Median length 63
Mean length 13.61565474
Min length 1

Characters and Unicode

Total characters 7987406
Distinct characters 82
Distinct categories 11
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 27490
Unique (%) 4.7%

Sample

1st row MMS-CABP/02B-E4
2nd row 4/III-23-TDS
3rd row USARP/EL/12/1002/USC
4th row USFC/A2059
5th row USFC/A5374
ValueCountFrequency (%)
mms-mafla/jar 17292
 
2.6%
bolland/rfb 7605
 
1.1%
humes 5243
 
0.8%
jpem 5029
 
0.8%
4975
 
0.8%
rh 2306
 
0.3%
k-rh 1557
 
0.2%
spm 1164
 
0.2%
mnhn-norfolk 1131
 
0.2%
haul 1040
 
0.2%
Other values (59086) 614438
92.8%

Most occurring characters

ValueCountFrequency (%)
/ 742746
 
9.3%
S 650690
 
8.1%
M 501374
 
6.3%
- 480058
 
6.0%
A 421866
 
5.3%
1 403237
 
5.0%
0 377832
 
4.7%
C 368160
 
4.6%
2 360968
 
4.5%
U 266532
 
3.3%
Other values (72) 3413943
42.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3902312
48.9%
Decimal Number 2536931
31.8%
Other Punctuation 835673
 
10.5%
Dash Punctuation 480058
 
6.0%
Lowercase Letter 145893
 
1.8%
Space Separator 75146
 
0.9%
Connector Punctuation 7573
 
0.1%
Open Punctuation 1756
 
< 0.1%
Close Punctuation 1756
 
< 0.1%
Math Symbol 302
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 650690
16.7%
M 501374
12.8%
A 421866
10.8%
C 368160
9.4%
U 266532
 
6.8%
F 236190
 
6.1%
I 186859
 
4.8%
R 170619
 
4.4%
L 169981
 
4.4%
P 165609
 
4.2%
Other values (16) 764432
19.6%
Lowercase Letter
ValueCountFrequency (%)
e 25305
17.3%
r 24955
17.1%
a 23105
15.8%
l 9450
 
6.5%
s 8105
 
5.6%
i 7884
 
5.4%
o 7864
 
5.4%
u 7559
 
5.2%
m 5786
 
4.0%
t 4694
 
3.2%
Other values (16) 21186
14.5%
Other Punctuation
ValueCountFrequency (%)
/ 742746
88.9%
: 80860
 
9.7%
. 4233
 
0.5%
; 3671
 
0.4%
, 2634
 
0.3%
# 938
 
0.1%
\ 340
 
< 0.1%
? 150
 
< 0.1%
& 61
 
< 0.1%
" 16
 
< 0.1%
Other values (2) 24
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 403237
15.9%
0 377832
14.9%
2 360968
14.2%
5 260750
10.3%
3 252386
9.9%
4 217309
8.6%
7 192322
7.6%
6 178170
7.0%
8 164692
6.5%
9 129265
 
5.1%
Math Symbol
ValueCountFrequency (%)
+ 290
96.0%
= 12
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 480058
100.0%
Space Separator
ValueCountFrequency (%)
75146
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7573
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1756
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1756
100.0%
Control
ValueCountFrequency (%)
 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4048205
50.7%
Common 3939201
49.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 650690
16.1%
M 501374
12.4%
A 421866
10.4%
C 368160
9.1%
U 266532
 
6.6%
F 236190
 
5.8%
I 186859
 
4.6%
R 170619
 
4.2%
L 169981
 
4.2%
P 165609
 
4.1%
Other values (42) 910325
22.5%
Common
ValueCountFrequency (%)
/ 742746
18.9%
- 480058
12.2%
1 403237
10.2%
0 377832
9.6%
2 360968
9.2%
5 260750
 
6.6%
3 252386
 
6.4%
4 217309
 
5.5%
7 192322
 
4.9%
6 178170
 
4.5%
Other values (20) 473423
12.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7987406
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 742746
 
9.3%
S 650690
 
8.1%
M 501374
 
6.3%
- 480058
 
6.0%
A 421866
 
5.3%
1 403237
 
5.0%
0 377832
 
4.7%
C 368160
 
4.6%
2 360968
 
4.5%
U 266532
 
3.3%
Other values (72) 3413943
42.7%

eventDate
Text

Missing 

Distinct 45561
Distinct (%) 3.7%
Missing 688611
Missing (%) 35.7%
Memory size 14.7 MiB

Length

Max length 21
Median length 10
Mean length 9.825816662
Min length 4

Characters and Unicode

Total characters 12162219
Distinct characters 17
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6824
Unique (%) 0.6%

Sample

1st row 1976-03-03
2nd row 1984-05-15
3rd row 1964-03-15
4th row 1883-08-31
5th row 1909-03-02
ValueCountFrequency (%)
1915 6254
 
0.5%
1982-07-21 5684
 
0.5%
1981-07-06 5412
 
0.4%
1983-05-13 5155
 
0.4%
1982-11-19 5039
 
0.4%
1982-02-10 4461
 
0.4%
1981-11-09 4297
 
0.3%
1913 4293
 
0.3%
1982-05-10 4269
 
0.3%
1977-01-28/1977-02-13 3795
 
0.3%
Other values (45551) 1189123
96.1%

Most occurring characters

ValueCountFrequency (%)
1 2343420
19.3%
- 2329130
19.2%
0 1804499
14.8%
9 1499550
12.3%
2 828832
 
6.8%
8 778911
 
6.4%
7 716498
 
5.9%
6 564568
 
4.6%
5 436405
 
3.6%
3 431150
 
3.5%
Other values (7) 429256
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9788344
80.5%
Dash Punctuation 2329130
 
19.2%
Other Punctuation 44740
 
0.4%
Lowercase Letter 4
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2343420
23.9%
0 1804499
18.4%
9 1499550
15.3%
2 828832
 
8.5%
8 778911
 
8.0%
7 716498
 
7.3%
6 564568
 
5.8%
5 436405
 
4.5%
3 431150
 
4.4%
4 384511
 
3.9%
Lowercase Letter
ValueCountFrequency (%)
e 1
25.0%
x 1
25.0%
a 1
25.0%
s 1
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 2329130
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 44740
100.0%
Uppercase Letter
ValueCountFrequency (%)
T 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12162214
> 99.9%
Latin 5
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2343420
19.3%
- 2329130
19.2%
0 1804499
14.8%
9 1499550
12.3%
2 828832
 
6.8%
8 778911
 
6.4%
7 716498
 
5.9%
6 564568
 
4.6%
5 436405
 
3.6%
3 431150
 
3.5%
Other values (2) 429251
 
3.5%
Latin
ValueCountFrequency (%)
T 1
20.0%
e 1
20.0%
x 1
20.0%
a 1
20.0%
s 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12162219
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2343420
19.3%
- 2329130
19.2%
0 1804499
14.8%
9 1499550
12.3%
2 828832
 
6.8%
8 778911
 
6.4%
7 716498
 
5.9%
6 564568
 
4.6%
5 436405
 
3.6%
3 431150
 
3.5%
Other values (7) 429256
 
3.5%
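
The eventDate values listed above come in several shapes: bare years (1915), full dates (1982-07-21), and slash-separated ranges (1977-01-28/1977-02-13). The character tables also list five Latin letters, which suggests that at least one value is not a date at all. The sketch below is one possible way to sort such strings by shape before parsing them; the function name `classify_event_date` and its labels are hypothetical and not part of the report.

```python
import re
from collections import Counter

# One date component: YYYY, YYYY-MM, or YYYY-MM-DD (digits only, no range check).
_DATE_PART = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")


def classify_event_date(value: str) -> str:
    """Label an eventDate string as 'year', 'month', 'day', 'interval', or 'other'."""
    parts = value.split("/")
    if len(parts) > 2 or not all(_DATE_PART.fullmatch(p) for p in parts):
        return "other"
    if len(parts) == 2:
        return "interval"
    # A single component: its length tells us the precision.
    return {4: "year", 7: "month", 10: "day"}[len(parts[0])]


# Values taken from the frequency table above.
values = ["1915", "1982-07-21", "1977-01-28/1977-02-13"]
print(Counter(classify_event_date(v) for v in values))
```

Anything labelled 'other' would need manual inspection; the pattern checks only the digit layout, not whether the month and day are valid.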

startDayOfYear
Text

Missing 

Distinct 366
Distinct (%) < 0.1%
Missing 842313
Missing (%) 43.7%
Memory size 14.7 MiB

Length

Max length 3
Median length 3
Mean length 2.737319202
Min length 1

Characters and Unicode

Total characters 2967473
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 63
2nd row 136
3rd row 75
4th row 243
5th row 61
ValueCountFrequency (%)
202 9215
 
0.9%
133 9048
 
0.8%
187 8343
 
0.8%
130 7952
 
0.7%
323 7925
 
0.7%
41 7863
 
0.7%
145 7055
 
0.7%
313 6543
 
0.6%
175 6524
 
0.6%
263 6356
 
0.6%
Other values (356) 1007256
92.9%

Most occurring characters

ValueCountFrequency (%)
1 581036
19.6%
2 554744
18.7%
3 415233
14.0%
4 233074
7.9%
5 216424
 
7.3%
0 206942
 
7.0%
6 200957
 
6.8%
9 195331
 
6.6%
7 188977
 
6.4%
8 174755
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2967473
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 581036
19.6%
2 554744
18.7%
3 415233
14.0%
4 233074
7.9%
5 216424
 
7.3%
0 206942
 
7.0%
6 200957
 
6.8%
9 195331
 
6.6%
7 188977
 
6.4%
8 174755
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Common 2967473
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 581036
19.6%
2 554744
18.7%
3 415233
14.0%
4 233074
7.9%
5 216424
 
7.3%
0 206942
 
7.0%
6 200957
 
6.8%
9 195331
 
6.6%
7 188977
 
6.4%
8 174755
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2967473
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 581036
19.6%
2 554744
18.7%
3 415233
14.0%
4 233074
7.9%
5 216424
 
7.3%
0 206942
 
7.0%
6 200957
 
6.8%
9 195331
 
6.6%
7 188977
 
6.4%
8 174755
 
5.9%
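
The startDayOfYear sample rows (63, 136, 75, 243, 61) match the ordinal day of the eventDate sample rows (1976-03-03, 1984-05-15, 1964-03-15, 1883-08-31, 1909-03-02), assuming the two samples come from the same records. A minimal sketch of that consistency check, using only the Python standard library (the helper name `day_of_year` is an assumption for this example):

```python
from datetime import date


def day_of_year(iso_date: str) -> int:
    """Return the 1-based ordinal day within the year for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).timetuple().tm_yday


# eventDate sample rows paired with the startDayOfYear sample rows from this report.
samples = {
    "1976-03-03": 63,
    "1984-05-15": 136,
    "1964-03-15": 75,
    "1883-08-31": 243,
    "1909-03-02": 61,
}

for iso_date, reported in samples.items():
    computed = day_of_year(iso_date)
    print(iso_date, reported, computed, "ok" if computed == reported else "MISMATCH")
```

The same check could be run over the full column to flag records where the two fields disagree; only full YYYY-MM-DD values can be checked this way.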

endDayOfYear
Text

Missing 

Distinct 368
Distinct (%) < 0.1%
Missing 842311
Missing (%) 43.7%
Memory size 14.7 MiB

Length

Max length 10
Median length 3
Mean length 2.738167408
Min length 1

Characters and Unicode

Total characters 2968398
Distinct characters 20
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row 63
2nd row 136
3rd row 75
4th row 243
5th row 61
ValueCountFrequency (%)
202 9184
 
0.8%
133 9037
 
0.8%
187 8347
 
0.8%
41 7969
 
0.7%
323 7925
 
0.7%
130 7869
 
0.7%
153 7303
 
0.7%
313 6544
 
0.6%
191 6380
 
0.6%
44 6227
 
0.6%
Other values (360) 1007299
92.9%

Most occurring characters

ValueCountFrequency (%)
1 585916
19.7%
2 551639
18.6%
3 416638
14.0%
4 236593
8.0%
5 218340
 
7.4%
0 208775
 
7.0%
6 195661
 
6.6%
9 190870
 
6.4%
7 187463
 
6.3%
8 176487
 
5.9%
Other values (10) 16
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2968382
> 99.9%
Lowercase Letter 10
 
< 0.1%
Uppercase Letter 4
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 585916
19.7%
2 551639
18.6%
3 416638
14.0%
4 236593
8.0%
5 218340
 
7.4%
0 208775
 
7.0%
6 195661
 
6.6%
9 190870
 
6.4%
7 187463
 
6.3%
8 176487
 
5.9%
Lowercase Letter
ValueCountFrequency (%)
a 4
40.0%
e 2
20.0%
z 1
 
10.0%
g 1
 
10.0%
l 1
 
10.0%
k 1
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
L 2
50.0%
P 1
25.0%
E 1
25.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2968384
> 99.9%
Latin 14
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 585916
19.7%
2 551639
18.6%
3 416638
14.0%
4 236593
8.0%
5 218340
 
7.4%
0 208775
 
7.0%
6 195661
 
6.6%
9 190870
 
6.4%
7 187463
 
6.3%
8 176487
 
5.9%
Latin
ValueCountFrequency (%)
a 4
28.6%
e 2
14.3%
L 2
14.3%
P 1
 
7.1%
z 1
 
7.1%
E 1
 
7.1%
g 1
 
7.1%
l 1
 
7.1%
k 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2968398
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 585916
19.7%
2 551639
18.6%
3 416638
14.0%
4 236593
8.0%
5 218340
 
7.4%
0 208775
 
7.0%
6 195661
 
6.6%
9 190870
 
6.4%
7 187463
 
6.3%
8 176487
 
5.9%
Other values (10) 16
 
< 0.1%

year
Text

Missing 

Distinct 207
Distinct (%) < 0.1%
Missing 689273
Missing (%) 35.8%
Memory size 14.7 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 4948480
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8
Unique (%) < 0.1%

Sample

1st row 1976
2nd row 1984
3rd row 1964
4th row 1883
5th row 1909
ValueCountFrequency (%)
1977 73835
 
6.0%
1981 43749
 
3.5%
1976 42199
 
3.4%
1984 38196
 
3.1%
1982 38145
 
3.1%
1908 35299
 
2.9%
1983 34031
 
2.8%
1985 30482
 
2.5%
1964 28236
 
2.3%
1975 25013
 
2.0%
Other values (197) 847935
68.5%

Most occurring characters

ValueCountFrequency (%)
1 1361026
27.5%
9 1244662
25.2%
8 523421
 
10.6%
7 428827
 
8.7%
6 322906
 
6.5%
0 305534
 
6.2%
2 219318
 
4.4%
5 194130
 
3.9%
4 177446
 
3.6%
3 171210
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4948480
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1361026
27.5%
9 1244662
25.2%
8 523421
 
10.6%
7 428827
 
8.7%
6 322906
 
6.5%
0 305534
 
6.2%
2 219318
 
4.4%
5 194130
 
3.9%
4 177446
 
3.6%
3 171210
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Common 4948480
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1361026
27.5%
9 1244662
25.2%
8 523421
 
10.6%
7 428827
 
8.7%
6 322906
 
6.5%
0 305534
 
6.2%
2 219318
 
4.4%
5 194130
 
3.9%
4 177446
 
3.6%
3 171210
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4948480
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1361026
27.5%
9 1244662
25.2%
8 523421
 
10.6%
7 428827
 
8.7%
6 322906
 
6.5%
0 305534
 
6.2%
2 219318
 
4.4%
5 194130
 
3.9%
4 177446
 
3.6%
3 171210
 
3.5%

month
Text

Missing 

Distinct 12
Distinct (%) < 0.1%
Missing 800939
Missing (%) 41.6%
Memory size 14.7 MiB

Length

Max length 2
Median length 1
Mean length 1.191259705
Min length 1

Characters and Unicode

Total characters 1340708
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 3
2nd row 5
3rd row 3
4th row 8
5th row 3
ValueCountFrequency (%)
8 129894
11.5%
5 124558
11.1%
7 123176
10.9%
6 104255
9.3%
4 99639
8.9%
11 96677
8.6%
2 95459
8.5%
3 89439
7.9%
9 80447
7.1%
10 66176
5.9%
Other values (2) 115734
10.3%

Most occurring characters

ValueCountFrequency (%)
1 375264
28.0%
2 147860
 
11.0%
8 129894
 
9.7%
5 124558
 
9.3%
7 123176
 
9.2%
6 104255
 
7.8%
4 99639
 
7.4%
3 89439
 
6.7%
9 80447
 
6.0%
0 66176
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1340708
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 375264
28.0%
2 147860
 
11.0%
8 129894
 
9.7%
5 124558
 
9.3%
7 123176
 
9.2%
6 104255
 
7.8%
4 99639
 
7.4%
3 89439
 
6.7%
9 80447
 
6.0%
0 66176
 
4.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1340708
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 375264
28.0%
2 147860
 
11.0%
8 129894
 
9.7%
5 124558
 
9.3%
7 123176
 
9.2%
6 104255
 
7.8%
4 99639
 
7.4%
3 89439
 
6.7%
9 80447
 
6.0%
0 66176
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1340708
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 375264
28.0%
2 147860
 
11.0%
8 129894
 
9.7%
5 124558
 
9.3%
7 123176
 
9.2%
6 104255
 
7.8%
4 99639
 
7.4%
3 89439
 
6.7%
9 80447
 
6.0%
0 66176
 
4.9%

day
Text

Missing 

Distinct 31
Distinct (%) < 0.1%
Missing 887053
Missing (%) 46.0%
Memory size 14.7 MiB

Length

Max length 2
Median length 2
Mean length 1.70051956
Min length 1

Characters and Unicode

Total characters 1767418
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 3
2nd row 15
3rd row 15
4th row 31
5th row 2
ValueCountFrequency (%)
13 42864
 
4.1%
10 42434
 
4.1%
19 40651
 
3.9%
6 39463
 
3.8%
21 37986
 
3.7%
9 37781
 
3.6%
15 37214
 
3.6%
18 36290
 
3.5%
14 35493
 
3.4%
16 35080
 
3.4%
Other values (21) 654084
62.9%

Most occurring characters

ValueCountFrequency (%)
1 488527
27.6%
2 412832
23.4%
3 151754
 
8.6%
9 105393
 
6.0%
0 105122
 
5.9%
5 103912
 
5.9%
6 103662
 
5.9%
8 100571
 
5.7%
4 99991
 
5.7%
7 95654
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1767418
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 488527
27.6%
2 412832
23.4%
3 151754
 
8.6%
9 105393
 
6.0%
0 105122
 
5.9%
5 103912
 
5.9%
6 103662
 
5.9%
8 100571
 
5.7%
4 99991
 
5.7%
7 95654
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 1767418
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 488527
27.6%
2 412832
23.4%
3 151754
 
8.6%
9 105393
 
6.0%
0 105122
 
5.9%
5 103912
 
5.9%
6 103662
 
5.9%
8 100571
 
5.7%
4 99991
 
5.7%
7 95654
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1767418
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 488527
27.6%
2 412832
23.4%
3 151754
 
8.6%
9 105393
 
6.0%
0 105122
 
5.9%
5 103912
 
5.9%
6 103662
 
5.9%
8 100571
 
5.7%
4 99991
 
5.7%
7 95654
 
5.4%

verbatimEventDate
Text

Missing 

Distinct 47776
Distinct (%) 6.3%
Missing 1173199
Missing (%) 60.9%
Memory size 14.7 MiB

Length

Max length 181
Median length 11
Mean length 11.01797943
Min length 1

Characters and Unicode

Total characters 8298676
Distinct characters 81
Distinct categories 10
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 15837
Unique (%) 2.1%

Sample

1st row -- --- ----
2nd row 15 MAY 1984
3rd row 15 MAR 1964
4th row 03 MAR 1967
5th row 31 AUG 1958
Value Count Frequency (%)
(blank) 275912 12.6%
may 68627 3.1%
aug 65853 3.0%
jul 61532 2.8%
apr 57935 2.6%
feb 53288 2.4%
jun 52783 2.4%
nov 52211 2.4%
mar 46122 2.1%
1977 42132 1.9%
Other values (8403) 1419007 64.6%

Most occurring characters

ValueCountFrequency (%)
1442208
17.4%
1 1077550
13.0%
9 807908
 
9.7%
- 749611
 
9.0%
2 340282
 
4.1%
7 334273
 
4.0%
0 322856
 
3.9%
8 301958
 
3.6%
6 296090
 
3.6%
A 274119
 
3.3%
Other values (71) 2351821
28.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4021101
48.5%
Uppercase Letter 1821588
22.0%
Space Separator 1442208
 
17.4%
Dash Punctuation 749611
 
9.0%
Lowercase Letter 202119
 
2.4%
Other Punctuation 58056
 
0.7%
Close Punctuation 1860
 
< 0.1%
Open Punctuation 1857
 
< 0.1%
Connector Punctuation 187
 
< 0.1%
Math Symbol 89
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 23085
11.4%
r 22928
11.3%
l 19199
9.5%
n 18909
9.4%
i 17653
8.7%
a 15944
7.9%
t 14270
7.1%
p 13298
 
6.6%
g 11952
 
5.9%
u 11197
 
5.5%
Other values (15) 33684
16.7%
Uppercase Letter
ValueCountFrequency (%)
A 274119
15.0%
U 176026
 
9.7%
J 155789
 
8.6%
N 143298
 
7.9%
M 120575
 
6.6%
E 116136
 
6.4%
R 101244
 
5.6%
P 93222
 
5.1%
O 88378
 
4.9%
Y 68082
 
3.7%
Other values (14) 484719
26.6%
Other Punctuation
ValueCountFrequency (%)
. 19795
34.1%
/ 15745
27.1%
, 11253
19.4%
: 9409
16.2%
; 983
 
1.7%
? 319
 
0.5%
& 294
 
0.5%
' 244
 
0.4%
" 9
 
< 0.1%
\ 2
 
< 0.1%
Other values (2) 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1077550
26.8%
9 807908
20.1%
2 340282
 
8.5%
7 334273
 
8.3%
0 322856
 
8.0%
8 301958
 
7.5%
6 296090
 
7.4%
3 203568
 
5.1%
5 177563
 
4.4%
4 159053
 
4.0%
Math Symbol
ValueCountFrequency (%)
+ 80
89.9%
~ 8
 
9.0%
< 1
 
1.1%
Close Punctuation
ValueCountFrequency (%)
) 1838
98.8%
] 22
 
1.2%
Open Punctuation
ValueCountFrequency (%)
( 1837
98.9%
[ 20
 
1.1%
Space Separator
ValueCountFrequency (%)
1442208
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 749611
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 187
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6274969
75.6%
Latin 2023707
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 274119
 
13.5%
U 176026
 
8.7%
J 155789
 
7.7%
N 143298
 
7.1%
M 120575
 
6.0%
E 116136
 
5.7%
R 101244
 
5.0%
P 93222
 
4.6%
O 88378
 
4.4%
Y 68082
 
3.4%
Other values (39) 686838
33.9%
Common
ValueCountFrequency (%)
1442208
23.0%
1 1077550
17.2%
9 807908
12.9%
- 749611
11.9%
2 340282
 
5.4%
7 334273
 
5.3%
0 322856
 
5.1%
8 301958
 
4.8%
6 296090
 
4.7%
3 203568
 
3.2%
Other values (22) 398665
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8298676
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1442208
17.4%
1 1077550
13.0%
9 807908
 
9.7%
- 749611
 
9.0%
2 340282
 
4.1%
7 334273
 
4.0%
0 322856
 
3.9%
8 301958
 
3.6%
6 296090
 
3.6%
A 274119
 
3.3%
Other values (71) 2351821
28.3%
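
The verbatimEventDate samples follow a `DD MON YYYY` pattern, with `-- --- ----` apparently standing in for an unknown date. A minimal sketch of parsing that one pattern is shown below; the helper `parse_verbatim_date` is hypothetical, and with a maximum length of 181 and 47776 distinct values the column clearly holds other formats that this sketch simply returns `None` for.

```python
import re
from datetime import date

# English three-letter month abbreviations, as seen in the sample rows.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}

# Two-digit day, three-letter month, four-digit year, separated by spaces.
PATTERN = re.compile(r"^(\d{2}) ([A-Z]{3}) (\d{4})$")


def parse_verbatim_date(text):
    """Return a date for 'DD MON YYYY' strings, or None for anything else."""
    match = PATTERN.match(text.strip().upper())
    if match is None:
        return None
    day, month, year = match.groups()
    if month not in MONTHS:
        return None
    try:
        return date(int(year), MONTHS[month], int(day))
    except ValueError:
        # Impossible calendar dates such as 31 FEB are treated as unparsed.
        return None


for raw in ["-- --- ----", "15 MAY 1984", "31 AUG 1958"]:
    print(repr(raw), "->", parse_verbatim_date(raw))
```

Returning `None` rather than raising keeps placeholder rows and unexpected formats countable, which is useful when deciding whether further patterns are worth handling.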

habitat
Text

Missing 

Distinct 18961
Distinct (%) 27.4%
Missing 1857136
Missing (%) 96.4%
Memory size 14.7 MiB

Length

Max length 235
Median length 159
Mean length 19.79818646
Min length 1

Characters and Unicode

Total characters 1371163
Distinct characters 89
Distinct categories 10
Distinct scripts 2
Distinct blocks 2

Unique

Unique 13600
Unique (%) 19.6%

Sample

1st row Beach with fresh water creek running into it
2nd row Freshwater
3rd row In sand
4th row Mangrove
5th row Under rocks
Value Count Frequency (%)
freshwater 9208 4.1%
in 6886 3.1%
on 6374 2.8%
reef 6192 2.8%
sand 6092 2.7%
coral 5812 2.6%
of 4886 2.2%
rocks 4639 2.1%
sp 4290 1.9%
intertidal 4238 1.9%
Other values (6965) 165798 73.9%

Most occurring characters

ValueCountFrequency (%)
155158
 
11.3%
e 134098
 
9.8%
a 117967
 
8.6%
r 101199
 
7.4%
n 83052
 
6.1%
s 82888
 
6.0%
o 79802
 
5.8%
t 71848
 
5.2%
i 60753
 
4.4%
l 60225
 
4.4%
Other values (79) 424173
30.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1121766
81.8%
Space Separator 155158
 
11.3%
Uppercase Letter 60796
 
4.4%
Other Punctuation 20726
 
1.5%
Decimal Number 6945
 
0.5%
Math Symbol 2493
 
0.2%
Dash Punctuation 1845
 
0.1%
Open Punctuation 719
 
0.1%
Close Punctuation 714
 
0.1%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 134098
12.0%
a 117967
10.5%
r 101199
 
9.0%
n 83052
 
7.4%
s 82888
 
7.4%
o 79802
 
7.1%
t 71848
 
6.4%
i 60753
 
5.4%
l 60225
 
5.4%
d 54663
 
4.9%
Other values (18) 275271
24.5%
Uppercase Letter
ValueCountFrequency (%)
F 12119
19.9%
L 6196
10.2%
S 6181
10.2%
I 5575
9.2%
R 4363
 
7.2%
O 3937
 
6.5%
M 3425
 
5.6%
C 3204
 
5.3%
U 2435
 
4.0%
B 2327
 
3.8%
Other values (16) 11034
18.1%
Other Punctuation
ValueCountFrequency (%)
, 10235
49.4%
. 7694
37.1%
; 838
 
4.0%
/ 686
 
3.3%
' 442
 
2.1%
# 299
 
1.4%
& 196
 
0.9%
: 111
 
0.5%
% 90
 
0.4%
" 75
 
0.4%
Other values (3) 60
 
0.3%
Decimal Number
ValueCountFrequency (%)
1 1217
17.5%
0 1157
16.7%
2 887
12.8%
5 750
10.8%
3 666
9.6%
4 598
8.6%
6 523
7.5%
8 390
 
5.6%
7 387
 
5.6%
9 370
 
5.3%
Math Symbol
ValueCountFrequency (%)
+ 2456
98.5%
= 24
 
1.0%
< 7
 
0.3%
~ 4
 
0.2%
> 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 715
99.4%
[ 4
 
0.6%
Close Punctuation
ValueCountFrequency (%)
) 711
99.6%
] 3
 
0.4%
Space Separator
ValueCountFrequency (%)
155158
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1845
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1182562
86.2%
Common 188601
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 134098
 
11.3%
a 117967
 
10.0%
r 101199
 
8.6%
n 83052
 
7.0%
s 82888
 
7.0%
o 79802
 
6.7%
t 71848
 
6.1%
i 60753
 
5.1%
l 60225
 
5.1%
d 54663
 
4.6%
Other values (44) 336067
28.4%
Common
ValueCountFrequency (%)
155158
82.3%
, 10235
 
5.4%
. 7694
 
4.1%
+ 2456
 
1.3%
- 1845
 
1.0%
1 1217
 
0.6%
0 1157
 
0.6%
2 887
 
0.5%
; 838
 
0.4%
5 750
 
0.4%
Other values (25) 6364
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1371160
> 99.9%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
155158
 
11.3%
e 134098
 
9.8%
a 117967
 
8.6%
r 101199
 
7.4%
n 83052
 
6.1%
s 82888
 
6.0%
o 79802
 
5.8%
t 71848
 
5.2%
i 60753
 
4.4%
l 60225
 
4.4%
Other values (76) 424170
30.9%
None
ValueCountFrequency (%)
é 1
33.3%
° 1
33.3%
ç 1
33.3%
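
The summary percentages for habitat can be cross-checked from the raw counts. The sketch below assumes that Missing (%) is taken over all 1926393 observations (the figure in the report overview), while Distinct (%), Unique (%) and Mean length are taken over the non-missing rows only; the variable names are illustrative, and the assumption is only confirmed to the extent that the results match the reported values.

```python
TOTAL_ROWS = 1926393        # "Number of observations" in the report overview
MISSING = 1857136           # habitat: Missing
DISTINCT = 18961            # habitat: Distinct
UNIQUE = 13600              # habitat: Unique
TOTAL_CHARACTERS = 1371163  # habitat: Total characters

# Rows that actually carry a habitat value.
present = TOTAL_ROWS - MISSING

missing_pct = 100 * MISSING / TOTAL_ROWS
distinct_pct = 100 * DISTINCT / present
unique_pct = 100 * UNIQUE / present
mean_length = TOTAL_CHARACTERS / present

print(f"non-missing rows: {present}")
print(f"Missing (%):  {missing_pct:.1f}")
print(f"Distinct (%): {distinct_pct:.1f}")
print(f"Unique (%):   {unique_pct:.1f}")
print(f"Mean length:  {mean_length:.4f}")
```

Under these assumptions the computed figures agree with the reported 96.4%, 27.4%, 19.6% and a mean length of about 19.798.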

samplingEffort
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 1926392
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 7
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row 24.1667
Value Count Frequency (%)
24.1667 1 100.0%

Most occurring characters

ValueCountFrequency (%)
6 2
28.6%
2 1
14.3%
4 1
14.3%
. 1
14.3%
1 1
14.3%
7 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
85.7%
Other Punctuation 1
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 2
33.3%
2 1
16.7%
4 1
16.7%
1 1
16.7%
7 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 2
28.6%
2 1
14.3%
4 1
14.3%
. 1
14.3%
1 1
14.3%
7 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 2
28.6%
2 1
14.3%
4 1
14.3%
. 1
14.3%
1 1
14.3%
7 1
14.3%

fieldNotes
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 1926392
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 7
Distinct categories 3
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row -110.283
Value Count Frequency (%)
110.283 1 100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
- 1
12.5%
0 1
12.5%
. 1
12.5%
2 1
12.5%
8 1
12.5%
3 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
75.0%
Dash Punctuation 1
 
12.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
33.3%
0 1
16.7%
2 1
16.7%
8 1
16.7%
3 1
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
- 1
12.5%
0 1
12.5%
. 1
12.5%
2 1
12.5%
8 1
12.5%
3 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
- 1
12.5%
0 1
12.5%
. 1
12.5%
2 1
12.5%
8 1
12.5%
3 1
12.5%
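
samplingEffort and fieldNotes each hold a single value (24.1667 and -110.283), and both are shaped like decimal-degree coordinates. That could mean one record's fields were shifted during parsing, although the report itself does not say so. The sketch below shows one way such values might be flagged; the helper names are hypothetical, and the example row is an illustrative composite rather than a record quoted from the dataset.

```python
def looks_like_coordinate(text):
    """Return True if text is a decimal number within the longitude range."""
    try:
        value = float(text)
    except ValueError:
        return False
    # Require a decimal point so plain integers (IDs, counts) are not flagged.
    return "." in text and -180.0 <= value <= 180.0


def coordinate_like_fields(row, text_columns):
    """List the free-text columns of one record holding coordinate-like values."""
    return [
        name
        for name in text_columns
        if row.get(name) and looks_like_coordinate(row[name])
    ]


# Illustrative row only: values taken from the two column samples above.
row = {
    "samplingEffort": "24.1667",
    "fieldNotes": "-110.283",
    "habitat": "Under rocks",
}
flagged = coordinate_like_fields(row, ["samplingEffort", "fieldNotes", "habitat"])
print(flagged)
```

A hit is only a hint: a free-text field can legitimately contain a number, so flagged records would still need to be inspected by hand.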

locationID
Text

Missing 

Distinct 94703
Distinct (%) 10.0%
Missing 984066
Missing (%) 51.1%
Memory size 14.7 MiB

Length

Max length 37768
Median length 134
Mean length 4.4719158
Min length 1

Characters and Unicode

Total characters 4214007
Distinct characters 95
Distinct categories 12
Distinct scripts 2
Distinct blocks 3

Unique

Unique 52904
Unique (%) 5.6%

Sample

1st row E4
2nd row NR 12-4 ID 101
3rd row 23
4th row 1002
5th row 2059
Value Count Frequency (%)
not 12392 1.2%
rec 12070 1.2%
4 8476 0.8%
rhb 7696 0.7%
rfb 7623 0.7%
1 7614 0.7%
2 6232 0.6%
3 5496 0.5%
gs 5168 0.5%
6 5011 0.5%
Other values (80921) 965661 92.5%

Most occurring characters

ValueCountFrequency (%)
1 474584
 
11.3%
2 394541
 
9.4%
0 331952
 
7.9%
5 296061
 
7.0%
3 287737
 
6.8%
4 264333
 
6.3%
- 262376
 
6.2%
6 216672
 
5.1%
7 190959
 
4.5%
8 180969
 
4.3%
Other values (85) 1313823
31.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2803373
66.5%
Uppercase Letter 884393
 
21.0%
Dash Punctuation 262384
 
6.2%
Space Separator 99106
 
2.4%
Other Punctuation 75953
 
1.8%
Lowercase Letter 66356
 
1.6%
Connector Punctuation 8295
 
0.2%
Control 6660
 
0.2%
Close Punctuation 3380
 
0.1%
Open Punctuation 3373
 
0.1%
Other values (2) 734
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9590
14.5%
o 7889
11.9%
r 7537
11.4%
a 7245
10.9%
i 4117
 
6.2%
t 3885
 
5.9%
l 3658
 
5.5%
n 2898
 
4.4%
c 2771
 
4.2%
s 2638
 
4.0%
Other values (18) 14128
21.3%
Uppercase Letter
ValueCountFrequency (%)
A 92320
 
10.4%
S 79255
 
9.0%
C 72043
 
8.1%
B 66965
 
7.6%
R 60474
 
6.8%
M 57012
 
6.4%
N 52437
 
5.9%
E 48652
 
5.5%
I 45116
 
5.1%
T 37245
 
4.2%
Other values (17) 272874
30.9%
Other Punctuation
ValueCountFrequency (%)
: 37486
49.4%
. 24846
32.7%
, 7340
 
9.7%
/ 3833
 
5.0%
# 1569
 
2.1%
& 288
 
0.4%
; 175
 
0.2%
? 147
 
0.2%
* 124
 
0.2%
' 117
 
0.2%
Other values (4) 28
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 474584
16.9%
2 394541
14.1%
0 331952
11.8%
5 296061
10.6%
3 287737
10.3%
4 264333
9.4%
6 216672
7.7%
7 190959
6.8%
8 180969
 
6.5%
9 165565
 
5.9%
Close Punctuation
ValueCountFrequency (%)
) 3091
91.4%
] 288
 
8.5%
} 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 3084
91.4%
[ 288
 
8.5%
{ 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 262376
> 99.9%
8
 
< 0.1%
Control
ValueCountFrequency (%)
6630
99.5%
30
 
0.5%
Math Symbol
ValueCountFrequency (%)
+ 724
98.9%
= 8
 
1.1%
Other Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
99106
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 8295
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3263258
77.4%
Latin 950749
 
22.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 92320
 
9.7%
S 79255
 
8.3%
C 72043
 
7.6%
B 66965
 
7.0%
R 60474
 
6.4%
M 57012
 
6.0%
N 52437
 
5.5%
E 48652
 
5.1%
I 45116
 
4.7%
T 37245
 
3.9%
Other values (45) 339230
35.7%
Common
ValueCountFrequency (%)
1 474584
14.5%
2 394541
12.1%
0 331952
10.2%
5 296061
9.1%
3 287737
8.8%
4 264333
8.1%
- 262376
8.0%
6 216672
6.6%
7 190959
5.9%
8 180969
 
5.5%
Other values (30) 363074
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4213988
> 99.9%
None 11
 
< 0.1%
Punctuation 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 474584
 
11.3%
2 394541
 
9.4%
0 331952
 
7.9%
5 296061
 
7.0%
3 287737
 
6.8%
4 264333
 
6.3%
- 262376
 
6.2%
6 216672
 
5.1%
7 190959
 
4.5%
8 180969
 
4.3%
Other values (79) 1313804
31.2%
Punctuation
ValueCountFrequency (%)
8
100.0%
None
ValueCountFrequency (%)
ö 6
54.5%
é 2
 
18.2%
1
 
9.1%
É 1
 
9.1%
1
 
9.1%
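
locationID stands out for a maximum length of 37768 characters and 6660 characters in the Control category, neither of which is expected in an identifier. A minimal sketch for flagging such values follows; the helper `suspicious_reasons` is hypothetical and the 200-character threshold is an arbitrary assumption, not a limit taken from the report.

```python
import unicodedata


def suspicious_reasons(value, max_length=200):
    """Return the reasons a locationID value looks malformed (empty if none)."""
    reasons = []
    if len(value) > max_length:
        reasons.append("too long")
    # "Cc" is the Unicode general category for control characters
    # such as newlines and tabs.
    if any(unicodedata.category(ch) == "Cc" for ch in value):
        reasons.append("control character")
    return reasons


for candidate in ["NR 12-4 ID 101", "A\nB", "x" * 201]:
    print(repr(candidate[:20]), suspicious_reasons(candidate))
```

Embedded newlines in particular are worth checking, since they are a common cause of records being split or shifted when a delimited file is read.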

higherGeography
Text

Missing 

Distinct 12370
Distinct (%) 0.7%
Missing 67831
Missing (%) 3.5%
Memory size 14.7 MiB

Length

Max length 126
Median length 104
Mean length 36.17342494
Min length 4

Characters and Unicode

Total characters 67230553
Distinct characters 77
Distinct categories 7
Distinct scripts 2
Distinct blocks 2

Unique

Unique 3190
Unique (%) 0.2%

Sample

1st row North Atlantic Ocean, United States
2nd row North Atlantic Ocean, Gulf of Mexico, United States, Florida
3rd row North Atlantic Ocean, Caribbean Sea, Barbados
4th row North Atlantic Ocean, Gulf of Mexico, United States, Florida
5th row Philippines
Value Count Frequency (%)
ocean 1259909 13.4%
north 1098149 11.7%
united 886190 9.4%
states 871608 9.3%
atlantic 718309 7.7%
pacific 437003 4.7%
mexico 248368 2.6%
of 243369 2.6%
gulf 228771 2.4%
south 203325 2.2%
Other values (4652) 3191450 34.0%

Most occurring characters

ValueCountFrequency (%)
7527889
 
11.2%
a 6865365
 
10.2%
t 6256807
 
9.3%
i 4780196
 
7.1%
e 4733947
 
7.0%
n 4584442
 
6.8%
c 3760391
 
5.6%
o 2897132
 
4.3%
, 2857287
 
4.2%
r 2272065
 
3.4%
Other values (67) 20695032
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47723213
71.0%
Uppercase Letter 9110940
 
13.6%
Space Separator 7527889
 
11.2%
Other Punctuation 2867453
 
4.3%
Dash Punctuation 1038
 
< 0.1%
Open Punctuation 10
 
< 0.1%
Close Punctuation 10
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6865365
14.4%
t 6256807
13.1%
i 4780196
10.0%
e 4733947
9.9%
n 4584442
9.6%
c 3760391
7.9%
o 2897132
 
6.1%
r 2272065
 
4.8%
s 2141009
 
4.5%
l 1955194
 
4.1%
Other values (28) 7476665
15.7%
Uppercase Letter
ValueCountFrequency (%)
S 1397107
15.3%
O 1301413
14.3%
N 1193009
13.1%
A 1063860
11.7%
U 893936
9.8%
P 682206
7.5%
C 555491
 
6.1%
M 514537
 
5.6%
G 305981
 
3.4%
F 216163
 
2.4%
Other values (17) 987237
10.8%
Other Punctuation
ValueCountFrequency (%)
, 2857287
99.6%
. 7750
 
0.3%
' 2246
 
0.1%
? 153
 
< 0.1%
& 11
 
< 0.1%
/ 6
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 8
80.0%
[ 2
 
20.0%
Close Punctuation
ValueCountFrequency (%)
) 8
80.0%
] 2
 
20.0%
Space Separator
ValueCountFrequency (%)
7527889
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1038
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56834153
84.5%
Common 10396400
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6865365
12.1%
t 6256807
 
11.0%
i 4780196
 
8.4%
e 4733947
 
8.3%
n 4584442
 
8.1%
c 3760391
 
6.6%
o 2897132
 
5.1%
r 2272065
 
4.0%
s 2141009
 
3.8%
l 1955194
 
3.4%
Other values (55) 16587605
29.2%
Common
ValueCountFrequency (%)
7527889
72.4%
, 2857287
 
27.5%
. 7750
 
0.1%
' 2246
 
< 0.1%
- 1038
 
< 0.1%
? 153
 
< 0.1%
& 11
 
< 0.1%
( 8
 
< 0.1%
) 8
 
< 0.1%
/ 6
 
< 0.1%
Other values (2) 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 67229602
> 99.9%
None 951
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7527889
 
11.2%
a 6865365
 
10.2%
t 6256807
 
9.3%
i 4780196
 
7.1%
e 4733947
 
7.0%
n 4584442
 
6.8%
c 3760391
 
5.6%
o 2897132
 
4.3%
, 2857287
 
4.3%
r 2272065
 
3.4%
Other values (54) 20694081
30.8%
None
ValueCountFrequency (%)
ç 434
45.6%
í 144
 
15.1%
é 141
 
14.8%
ó 110
 
11.6%
á 100
 
10.5%
ê 7
 
0.7%
è 6
 
0.6%
ô 3
 
0.3%
ü 2
 
0.2%
ñ 1
 
0.1%
Other values (3) 3
 
0.3%
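
The higherGeography samples read as comma-separated paths running from the broadest unit to the most specific one. The sketch below splits such a value into its levels; the helper `split_higher_geography` is hypothetical, and it assumes that commas occur only as separators, which may not hold for every one of the 12370 distinct values.

```python
def split_higher_geography(value):
    """Split a comma-separated geography path into its trimmed levels."""
    return [part.strip() for part in value.split(",") if part.strip()]


samples = [
    "North Atlantic Ocean, Gulf of Mexico, United States, Florida",
    "North Atlantic Ocean, Caribbean Sea, Barbados",
    "Philippines",
]
for sample in samples:
    levels = split_higher_geography(sample)
    print(len(levels), levels)
```

Because the number of levels varies from row to row, a level's position alone does not identify its rank (ocean, sea, country, state).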

continent
Text

Missing 

Distinct 7
Distinct (%) < 0.1%
Missing 1027391
Missing (%) 53.3%
Memory size 14.7 MiB

Length

Max length 13
Median length 13
Mean length 9.980899931
Min length 4

Characters and Unicode

Total characters 8972849
Distinct characters 15
Distinct categories 2
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row NORTH_AMERICA
2nd row ASIA
3rd row NORTH_AMERICA
4th row OCEANIA
5th row NORTH_AMERICA
Value Count Frequency (%)
north_america 475004 52.8%
oceania 155883 17.3%
asia 135716 15.1%
south_america 44254 4.9%
africa 39371 4.4%
europe 33879 3.8%
antarctica 14895 1.7%

Most occurring characters

ValueCountFrequency (%)
A 1745141
19.4%
R 1082407
12.1%
I 865123
9.6%
C 744302
8.3%
E 742899
8.3%
O 709020
7.9%
N 645782
 
7.2%
T 549048
 
6.1%
H 519258
 
5.8%
_ 519258
 
5.8%
Other values (5) 850611
9.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 8453591
94.2%
Connector Punctuation 519258
 
5.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1745141
20.6%
R 1082407
12.8%
I 865123
10.2%
C 744302
8.8%
E 742899
8.8%
O 709020
8.4%
N 645782
 
7.6%
T 549048
 
6.5%
H 519258
 
6.1%
M 519258
 
6.1%
Other values (4) 331353
 
3.9%
Connector Punctuation
ValueCountFrequency (%)
_ 519258
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8453591
94.2%
Common 519258
 
5.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1745141
20.6%
R 1082407
12.8%
I 865123
10.2%
C 744302
8.8%
E 742899
8.8%
O 709020
8.4%
N 645782
 
7.6%
T 549048
 
6.5%
H 519258
 
6.1%
M 519258
 
6.1%
Other values (4) 331353
 
3.9%
Common
ValueCountFrequency (%)
_ 519258
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8972849
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1745141
19.4%
R 1082407
12.1%
I 865123
9.6%
C 744302
8.3%
E 742899
8.3%
O 709020
7.9%
N 645782
 
7.2%
T 549048
 
6.1%
H 519258
 
5.8%
_ 519258
 
5.8%
Other values (5) 850611
9.5%
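
Since every continent value is a single token, the word counts reported for this variable should add up to the number of non-missing rows. The sketch below checks that and recomputes each share; the total of 1926393 observations comes from the report overview, and the variable names are illustrative.

```python
TOTAL_ROWS = 1926393  # "Number of observations" in the report overview
MISSING = 1027391     # continent: Missing

# Counts as listed in the continent value table.
continent_counts = {
    "north_america": 475004,
    "oceania": 155883,
    "asia": 135716,
    "south_america": 44254,
    "africa": 39371,
    "europe": 33879,
    "antarctica": 14895,
}

present = TOTAL_ROWS - MISSING
counted = sum(continent_counts.values())

# Share of each continent among the rows that have a continent at all.
shares = {name: 100 * count / counted for name, count in continent_counts.items()}

print("non-missing rows:", present, "| sum of counts:", counted)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The two totals agree, so the listed percentages are shares of non-missing rows rather than of the whole dataset.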

waterBody
Text

Missing 

Distinct 1655
Distinct (%) 0.1%
Missing 666651
Missing (%) 34.6%
Memory size 14.7 MiB

Length

Max length 76
Median length 75
Mean length 24.49184833
Min length 7

Characters and Unicode

Total characters 30853410
Distinct characters 63
Distinct categories 7
Distinct scripts 2
Distinct blocks 2

Unique

Unique 510
Unique (%) < 0.1%

Sample

1st row North Atlantic Ocean
2nd row North Atlantic Ocean, Gulf of Mexico
3rd row North Atlantic Ocean, Caribbean Sea
4th row North Atlantic Ocean, Gulf of Mexico
5th row Antarctic Ocean
Value Count Frequency (%)
ocean 1259434 26.1%
north 998553 20.7%
atlantic 718247 14.9%
pacific 436962 9.1%
of 231313 4.8%
gulf 228638 4.7%
sea 193896 4.0%
mexico 187756 3.9%
south 160377 3.3%
caribbean 89358 1.9%
Other values (1319) 318010 6.6%

Most occurring characters

ValueCountFrequency (%)
3562802
11.5%
c 3175906
10.3%
a 3113538
 
10.1%
t 2738941
 
8.9%
n 2331622
 
7.6%
i 2082746
 
6.8%
e 1823700
 
5.9%
o 1648330
 
5.3%
O 1261125
 
4.1%
r 1218140
 
3.9%
Other values (53) 7896560
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22247025
72.1%
Uppercase Letter 4591399
 
14.9%
Space Separator 3562802
 
11.5%
Other Punctuation 451904
 
1.5%
Dash Punctuation 276
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 3175906
14.3%
a 3113538
14.0%
t 2738941
12.3%
n 2331622
10.5%
i 2082746
9.4%
e 1823700
8.2%
o 1648330
7.4%
r 1218140
 
5.5%
h 1180475
 
5.3%
l 988529
 
4.4%
Other values (20) 1945098
8.7%
Uppercase Letter
ValueCountFrequency (%)
O 1261125
27.5%
N 1000300
21.8%
A 784279
17.1%
P 450444
 
9.8%
S 386552
 
8.4%
G 231863
 
5.0%
M 210774
 
4.6%
C 120701
 
2.6%
B 53751
 
1.2%
I 51181
 
1.1%
Other values (15) 40429
 
0.9%
Other Punctuation
ValueCountFrequency (%)
, 451309
99.9%
. 465
 
0.1%
' 117
 
< 0.1%
? 13
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3562802
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 276
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 2
100.0%
Close Punctuation
ValueCountFrequency (%)
] 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 26838424
87.0%
Common 4014986
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 3175906
11.8%
a 3113538
11.6%
t 2738941
10.2%
n 2331622
 
8.7%
i 2082746
 
7.8%
e 1823700
 
6.8%
o 1648330
 
6.1%
O 1261125
 
4.7%
r 1218140
 
4.5%
h 1180475
 
4.4%
Other values (45) 6263901
23.3%
Common
ValueCountFrequency (%)
3562802
88.7%
, 451309
 
11.2%
. 465
 
< 0.1%
- 276
 
< 0.1%
' 117
 
< 0.1%
? 13
 
< 0.1%
[ 2
 
< 0.1%
] 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30853307
> 99.9%
None 103
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3562802
11.5%
c 3175906
10.3%
a 3113538
 
10.1%
t 2738941
 
8.9%
n 2331622
 
7.6%
i 2082746
 
6.8%
e 1823700
 
5.9%
o 1648330
 
5.3%
O 1261125
 
4.1%
r 1218140
 
3.9%
Other values (49) 7896457
25.6%
None
ValueCountFrequency (%)
í 48
46.6%
á 46
44.7%
ó 6
 
5.8%
è 3
 
2.9%

islandGroup
Text

Missing 

Distinct 20
Distinct (%) 2.6%
Missing 1925623
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 17
Median length 15
Mean length 14.52857143
Min length 5

Characters and Unicode

Total characters 11187
Distinct characters 35
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 6
Unique (%) 0.8%

Sample

1st row Society Islands
2nd row Society Islands
3rd row Society Islands
4th row Society Islands
5th row Society Islands
Value Count Frequency (%)
islands 707 47.0%
society 679 45.2%
exuma 20 1.3%
south 12 0.8%
sandwich 12 0.8%
florida 10 0.7%
keys 10 0.7%
pacific 10 0.7%
carolina 8 0.5%
aleutian 7 0.5%
Other values (14) 28 1.9%

Most occurring characters

ValueCountFrequency (%)
s 1446
12.9%
a 803
 
7.2%
l 751
 
6.7%
n 748
 
6.7%
i 743
 
6.6%
d 738
 
6.6%
733
 
6.6%
o 722
 
6.5%
c 713
 
6.4%
e 711
 
6.4%
Other values (25) 3079
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8951
80.0%
Uppercase Letter 1503
 
13.4%
Space Separator 733
 
6.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1446
16.2%
a 803
9.0%
l 751
8.4%
n 748
8.4%
i 743
8.3%
d 738
8.2%
o 722
8.1%
c 713
8.0%
e 711
7.9%
t 699
7.8%
Other values (11) 877
9.8%
Uppercase Letter
ValueCountFrequency (%)
I 710
47.2%
S 703
46.8%
E 21
 
1.4%
C 16
 
1.1%
P 12
 
0.8%
F 10
 
0.7%
K 10
 
0.7%
A 7
 
0.5%
M 6
 
0.4%
R 2
 
0.1%
Other values (3) 6
 
0.4%
Space Separator
ValueCountFrequency (%)
733
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10454
93.4%
Common 733
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1446
13.8%
a 803
 
7.7%
l 751
 
7.2%
n 748
 
7.2%
i 743
 
7.1%
d 738
 
7.1%
o 722
 
6.9%
c 713
 
6.8%
e 711
 
6.8%
I 710
 
6.8%
Other values (24) 2369
22.7%
Common
ValueCountFrequency (%)
733
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11187
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1446
12.9%
a 803
 
7.2%
l 751
 
6.7%
n 748
 
6.7%
i 743
 
6.6%
d 738
 
6.6%
733
 
6.6%
o 722
 
6.5%
c 713
 
6.4%
e 711
 
6.4%
Other values (25) 3079
27.5%

island
Text

Missing 

Distinct 58
Distinct (%) 5.9%
Missing 1925415
Missing (%) 99.9%
Memory size 14.7 MiB

Length

Max length 25
Median length 6
Mean length 6.676891616
Min length 4

Characters and Unicode

Total characters 6530
Distinct characters 49
Distinct categories 4
Distinct scripts 2
Distinct blocks 2

Unique

Unique 33
Unique (%) 3.4%

Sample

1st row Moorea
2nd row Moorea
3rd row Shikoku
4th row Oahu
5th row Moorea
Value Count Frequency (%)
moorea 674 60.4%
oahu 147 13.2%
island 91 8.2%
great 20 1.8%
exuma 20 1.8%
nunivak 13 1.2%
eniwetok 13 1.2%
bonaire 11 1.0%
key 10 0.9%
west 10 0.9%
Other values (58) 106 9.5%

Most occurring characters

ValueCountFrequency (%)
o 1430
21.9%
a 1060
16.2%
e 771
11.8%
r 737
11.3%
M 683
10.5%
u 225
 
3.4%
n 186
 
2.8%
h 170
 
2.6%
O 154
 
2.4%
137
 
2.1%
Other values (39) 977
15.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5279
80.8%
Uppercase Letter 1113
 
17.0%
Space Separator 137
 
2.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1430
27.1%
a 1060
20.1%
e 771
14.6%
r 737
14.0%
u 225
 
4.3%
n 186
 
3.5%
h 170
 
3.2%
s 121
 
2.3%
d 107
 
2.0%
l 105
 
2.0%
Other values (16) 367
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
M 683
61.4%
O 154
 
13.8%
I 90
 
8.1%
E 35
 
3.1%
G 23
 
2.1%
K 21
 
1.9%
N 19
 
1.7%
S 19
 
1.7%
B 17
 
1.5%
R 11
 
1.0%
Other values (11) 41
 
3.7%
Space Separator
ValueCountFrequency (%)
137
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6392
97.9%
Common 138
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 1430
22.4%
a 1060
16.6%
e 771
12.1%
r 737
11.5%
M 683
10.7%
u 225
 
3.5%
n 186
 
2.9%
h 170
 
2.7%
O 154
 
2.4%
s 121
 
1.9%
Other values (37) 855
13.4%
Common
ValueCountFrequency (%)
137
99.3%
. 1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6528
> 99.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 1430
21.9%
a 1060
16.2%
e 771
11.8%
r 737
11.3%
M 683
10.5%
u 225
 
3.4%
n 186
 
2.8%
h 170
 
2.6%
O 154
 
2.4%
137
 
2.1%
Other values (38) 975
14.9%
None
ValueCountFrequency (%)
á 2
100.0%

countryCode
Text

Missing 

Distinct 239
Distinct (%) < 0.1%
Missing 110759
Missing (%) 5.7%
Memory size 14.7 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 3631268
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row US
2nd row US
3rd row BB
4th row US
5th row PH
Value Count Frequency (%)
us 868583 47.8%
ph 93802 5.2%
mx 59371 3.3%
pa 46369 2.6%
aq 44802 2.5%
jp 38538 2.1%
cu 30147 1.7%
ca 28674 1.6%
jm 27586 1.5%
pf 27226 1.5%
Other values (229) 550536 30.3%

Most occurring characters

ValueCountFrequency (%)
U 948161
26.1%
S 926976
25.5%
P 250779
 
6.9%
A 177982
 
4.9%
M 160911
 
4.4%
H 143259
 
3.9%
C 133182
 
3.7%
B 95322
 
2.6%
J 78390
 
2.2%
G 66596
 
1.8%
Other values (16) 649710
17.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3631268
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 948161
26.1%
S 926976
25.5%
P 250779
 
6.9%
A 177982
 
4.9%
M 160911
 
4.4%
H 143259
 
3.9%
C 133182
 
3.7%
B 95322
 
2.6%
J 78390
 
2.2%
G 66596
 
1.8%
Other values (16) 649710
17.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 3631268
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 948161
26.1%
S 926976
25.5%
P 250779
 
6.9%
A 177982
 
4.9%
M 160911
 
4.4%
H 143259
 
3.9%
C 133182
 
3.7%
B 95322
 
2.6%
J 78390
 
2.2%
G 66596
 
1.8%
Other values (16) 649710
17.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3631268
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 948161
26.1%
S 926976
25.5%
P 250779
 
6.9%
A 177982
 
4.9%
M 160911
 
4.4%
H 143259
 
3.9%
C 133182
 
3.7%
B 95322
 
2.6%
J 78390
 
2.2%
G 66596
 
1.8%
Other values (16) 649710
17.9%
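
Every countryCode value is exactly two characters long and consists only of uppercase letters, which is the shape of the ISO 3166-1 alpha-2 codes that Darwin Core recommends for this term. The sketch below checks that shape only; the helper `is_two_letter_code` is hypothetical, and it does not verify that a code is actually assigned to a country.

```python
import re

# Exactly two uppercase ASCII letters, e.g. "US" or "PH".
TWO_LETTER_CODE = re.compile(r"^[A-Z]{2}$")


def is_two_letter_code(value):
    """Return True if value has the shape of an alpha-2 country code."""
    return TWO_LETTER_CODE.match(value) is not None


for candidate in ["US", "BB", "PH", "us", "USA", ""]:
    print(repr(candidate), is_two_letter_code(candidate))
```

Checking membership against an official code list would be a natural next step, but such a list is not part of this report.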

stateProvince
Text

Missing 

Distinct 1326
Distinct (%) 0.1%
Missing 943673
Missing (%) 49.0%
Memory size 14.7 MiB

Length

Max length 51
Median length 39
Mean length 9.182679705
Min length 3

Characters and Unicode

Total characters 9024003
Distinct characters 70
Distinct categories 8
Distinct scripts 2
Distinct blocks 2

Unique

Unique 281
Unique (%) < 0.1%

Sample

1st row Florida
2nd row Florida
3rd row Massachusetts
4th row Quezon
5th row Newfoundland
Value Count Frequency (%)
florida 157981 13.1%
massachusetts 103383 8.6%
california 57085 4.7%
carolina 53929 4.5%
texas 43591 3.6%
alaska 41859 3.5%
north 31994 2.7%
louisiana 28645 2.4%
hawaii 26401 2.2%
south 26211 2.2%
Other values (1250) 635019 52.7%

Most occurring characters

ValueCountFrequency (%)
a 1427949
15.8%
i 809015
 
9.0%
s 773254
 
8.6%
o 650882
 
7.2%
r 519439
 
5.8%
l 506660
 
5.6%
n 498668
 
5.5%
e 457618
 
5.1%
t 400633
 
4.4%
u 277325
 
3.1%
Other values (60) 2702560
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7611777
84.4%
Uppercase Letter 1183253
 
13.1%
Space Separator 223378
 
2.5%
Other Punctuation 5089
 
0.1%
Dash Punctuation 489
 
< 0.1%
Open Punctuation 8
 
< 0.1%
Close Punctuation 8
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1427949
18.8%
i 809015
10.6%
s 773254
10.2%
o 650882
8.6%
r 519439
 
6.8%
l 506660
 
6.7%
n 498668
 
6.6%
e 457618
 
6.0%
t 400633
 
5.3%
u 277325
 
3.6%
Other values (24) 1290334
17.0%
Uppercase Letter
ValueCountFrequency (%)
M 171151
14.5%
C 165212
14.0%
F 164699
13.9%
A 80869
 
6.8%
N 78781
 
6.7%
T 76132
 
6.4%
S 72421
 
6.1%
I 44681
 
3.8%
G 38393
 
3.2%
L 36085
 
3.0%
Other values (17) 254829
21.5%
Other Punctuation
ValueCountFrequency (%)
, 4593
90.3%
. 302
 
5.9%
' 148
 
2.9%
? 46
 
0.9%
Space Separator
ValueCountFrequency (%)
223378
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 489
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8795030
97.5%
Common 228973
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1427949
16.2%
i 809015
 
9.2%
s 773254
 
8.8%
o 650882
 
7.4%
r 519439
 
5.9%
l 506660
 
5.8%
n 498668
 
5.7%
e 457618
 
5.2%
t 400633
 
4.6%
u 277325
 
3.2%
Other values (51) 2473587
28.1%
Common
ValueCountFrequency (%)
223378
97.6%
, 4593
 
2.0%
- 489
 
0.2%
. 302
 
0.1%
' 148
 
0.1%
? 46
 
< 0.1%
( 8
 
< 0.1%
) 8
 
< 0.1%
| 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9023619
> 99.9%
None 384
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1427949
15.8%
i 809015
 
9.0%
s 773254
 
8.6%
o 650882
 
7.2%
r 519439
 
5.8%
l 506660
 
5.6%
n 498668
 
5.5%
e 457618
 
5.1%
t 400633
 
4.4%
u 277325
 
3.1%
Other values (51) 2702176
29.9%
None
ValueCountFrequency (%)
é 123
32.0%
ó 101
26.3%
í 96
25.0%
á 52
13.5%
ê 7
 
1.8%
è 2
 
0.5%
Ñ 1
 
0.3%
ú 1
 
0.3%
ô 1
 
0.3%

county
Text

Missing 

Distinct: 2594
Distinct (%): 1.9%
Missing: 1786420
Missing (%): 92.7%
Memory size: 14.7 MiB

Length

Max length: 46
Median length: 43
Mean length: 14.35974795
Min length: 3

Characters and Unicode

Total characters: 2009977
Distinct characters: 65
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 558
Unique (%): 0.4%
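
The summary figures for county can be cross-checked with simple arithmetic. The sketch below assumes that the mean length is the total character count divided by the number of non-missing values, and that the Distinct and Unique percentages use the non-missing count as their denominator. These definitions are inferred from the numbers in this report, not taken from the profiling tool's documentation.

```python
# Figures copied from this report.
observations = 1926393      # number of observations in the dataset
missing = 1786420           # missing values in county
distinct = 2594
unique = 558
total_characters = 2009977

non_missing = observations - missing

# Assumed definitions (inferred from the report, not from tool documentation).
mean_length = total_characters / non_missing
missing_pct = 100 * missing / observations
distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing

print(non_missing)
print(round(mean_length, 8))
print(round(missing_pct, 1), round(distinct_pct, 1), round(unique_pct, 1))
```

If the assumptions hold, the printed values should agree with the Mean length, Missing (%), Distinct (%) and Unique (%) entries listed for this variable.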

Sample

1st row: Cumberland County
2nd row: Allamakee County
3rd row: St. Lucie County
4th row: Delaware County
5th row: Kimble County

Value                   Count      Frequency (%)
county                  135423     45.4%
st                      3893       1.3%
parish                  3203       1.1%
monroe                  3117       1.0%
lucie                   2649       0.9%
montgomery              2553       0.9%
san                     2117       0.7%
prince                  1875       0.6%
george's                1763       0.6%
jackson                 1748       0.6%
Other values (2256)     139876     46.9%

Most occurring characters

ValueCountFrequency (%)
n 223770
11.1%
o 216846
10.8%
t 181049
 
9.0%
u 160924
 
8.0%
158244
 
7.9%
C 152414
 
7.6%
y 151819
 
7.6%
e 105735
 
5.3%
a 103265
 
5.1%
r 74023
 
3.7%
Other values (55) 481888
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1547182
77.0%
Uppercase Letter 298415
 
14.8%
Space Separator 158244
 
7.9%
Other Punctuation 5911
 
0.3%
Dash Punctuation 225
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 223770
14.5%
o 216846
14.0%
t 181049
11.7%
u 160924
10.4%
y 151819
9.8%
e 105735
6.8%
a 103265
6.7%
r 74023
 
4.8%
i 55529
 
3.6%
l 50155
 
3.2%
Other values (22) 224067
14.5%
Uppercase Letter
ValueCountFrequency (%)
C 152414
51.1%
M 16357
 
5.5%
S 14112
 
4.7%
L 13053
 
4.4%
P 12734
 
4.3%
B 11994
 
4.0%
G 8960
 
3.0%
W 8635
 
2.9%
A 8280
 
2.8%
D 7831
 
2.6%
Other values (16) 44045
 
14.8%
Other Punctuation
ValueCountFrequency (%)
. 3891
65.8%
' 1979
33.5%
, 24
 
0.4%
& 11
 
0.2%
/ 6
 
0.1%
Space Separator
ValueCountFrequency (%)
158244
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 225
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1845597
91.8%
Common 164380
 
8.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 223770
12.1%
o 216846
11.7%
t 181049
9.8%
u 160924
 
8.7%
C 152414
 
8.3%
y 151819
 
8.2%
e 105735
 
5.7%
a 103265
 
5.6%
r 74023
 
4.0%
i 55529
 
3.0%
Other values (48) 420223
22.8%
Common
ValueCountFrequency (%)
158244
96.3%
. 3891
 
2.4%
' 1979
 
1.2%
- 225
 
0.1%
, 24
 
< 0.1%
& 11
 
< 0.1%
/ 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2009968
> 99.9%
None 9
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 223770
11.1%
o 216846
10.8%
t 181049
 
9.0%
u 160924
 
8.0%
158244
 
7.9%
C 152414
 
7.6%
y 151819
 
7.6%
e 105735
 
5.3%
a 103265
 
5.1%
r 74023
 
3.7%
Other values (49) 481879
24.0%
None
ValueCountFrequency (%)
ó 3
33.3%
ü 2
22.2%
ñ 1
 
11.1%
ç 1
 
11.1%
ø 1
 
11.1%
è 1
 
11.1%

locality
Text

Missing 

Distinct: 204742
Distinct (%): 15.9%
Missing: 642386
Missing (%): 33.3%
Memory size: 14.7 MiB

Length

Max length: 21793
Median length: 378
Mean length: 29.00482474
Min length: 1

Characters and Unicode

Total characters: 37242398
Distinct characters: 139
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 126316
Unique (%): 9.8%

Sample

1st row: off Delaware
2nd row: W Coast
3rd row: Cape Sable, West Of
4th row: Antarctic Peninsula
5th row: Georges Bank

Value                   Count      Frequency (%)
island                  342357     5.6%
of                      336472     5.5%
off                     252665     4.1%
bay                     137534     2.2%
islands                 98147      1.6%
bank                    84597      1.4%
south                   74630      1.2%
georges                 66663      1.1%
florida                 63432      1.0%
river                   63370      1.0%
Other values (77326)    4636608    75.3%

Most occurring characters

ValueCountFrequency (%)
4869900
 
13.1%
a 3498938
 
9.4%
e 2451391
 
6.6%
o 2297059
 
6.2%
n 2155175
 
5.8%
r 1674733
 
4.5%
s 1629255
 
4.4%
i 1598121
 
4.3%
l 1584743
 
4.3%
t 1476204
 
4.0%
Other values (129) 14006879
37.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25267159
67.8%
Uppercase Letter 5379884
 
14.4%
Space Separator 4869900
 
13.1%
Other Punctuation 1210117
 
3.2%
Decimal Number 428539
 
1.2%
Dash Punctuation 41585
 
0.1%
Open Punctuation 15189
 
< 0.1%
Close Punctuation 15060
 
< 0.1%
Control 8574
 
< 0.1%
Math Symbol 5030
 
< 0.1%
Other values (7) 1361
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3498938
13.8%
e 2451391
9.7%
o 2297059
 
9.1%
n 2155175
 
8.5%
r 1674733
 
6.6%
s 1629255
 
6.4%
i 1598121
 
6.3%
l 1584743
 
6.3%
t 1476204
 
5.8%
d 1018124
 
4.0%
Other values (49) 5883416
23.3%
Uppercase Letter
ValueCountFrequency (%)
S 540440
 
10.0%
I 502201
 
9.3%
B 476049
 
8.8%
C 467229
 
8.7%
O 360411
 
6.7%
P 312954
 
5.8%
M 279912
 
5.2%
R 263111
 
4.9%
L 254938
 
4.7%
A 251406
 
4.7%
Other values (19) 1671233
31.1%
Other Punctuation
ValueCountFrequency (%)
, 987016
81.6%
. 147443
 
12.2%
' 31555
 
2.6%
; 24458
 
2.0%
/ 8141
 
0.7%
# 2753
 
0.2%
: 2526
 
0.2%
& 2525
 
0.2%
" 2473
 
0.2%
? 1188
 
0.1%
Other values (6) 39
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 83368
19.5%
0 71352
16.7%
2 58104
13.6%
5 50075
11.7%
3 40076
9.4%
4 32432
 
7.6%
6 30840
 
7.2%
7 22387
 
5.2%
8 20826
 
4.9%
9 19079
 
4.5%
Math Symbol
ValueCountFrequency (%)
+ 4129
82.1%
> 403
 
8.0%
= 370
 
7.4%
~ 121
 
2.4%
< 3
 
0.1%
| 2
 
< 0.1%
± 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 14382
94.7%
[ 789
 
5.2%
{ 18
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 14274
94.8%
] 776
 
5.2%
} 10
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 41584
> 99.9%
1
 
< 0.1%
Control
ValueCountFrequency (%)
8535
99.5%
39
 
0.5%
Space Separator
ValueCountFrequency (%)
4869900
100.0%
Other Symbol
ValueCountFrequency (%)
° 762
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 587
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 6
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 3
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30647043
82.3%
Common 6595355
 
17.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3498938
 
11.4%
e 2451391
 
8.0%
o 2297059
 
7.5%
n 2155175
 
7.0%
r 1674733
 
5.5%
s 1629255
 
5.3%
i 1598121
 
5.2%
l 1584743
 
5.2%
t 1476204
 
4.8%
d 1018124
 
3.3%
Other values (78) 11263300
36.8%
Common
ValueCountFrequency (%)
4869900
73.8%
, 987016
 
15.0%
. 147443
 
2.2%
1 83368
 
1.3%
0 71352
 
1.1%
2 58104
 
0.9%
5 50075
 
0.8%
- 41584
 
0.6%
3 40076
 
0.6%
4 32432
 
0.5%
Other values (41) 214005
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37240432
> 99.9%
None 1960
 
< 0.1%
Modifier Letters 3
 
< 0.1%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4869900
 
13.1%
a 3498938
 
9.4%
e 2451391
 
6.6%
o 2297059
 
6.2%
n 2155175
 
5.8%
r 1674733
 
4.5%
s 1629255
 
4.4%
i 1598121
 
4.3%
l 1584743
 
4.3%
t 1476204
 
4.0%
Other values (86) 14004913
37.6%
None
ValueCountFrequency (%)
° 762
38.9%
é 230
 
11.7%
ã 187
 
9.5%
á 141
 
7.2%
ó 138
 
7.0%
í 109
 
5.6%
ñ 78
 
4.0%
ú 55
 
2.8%
ç 36
 
1.8%
ī 36
 
1.8%
Other values (29) 188
 
9.6%
Modifier Letters
ValueCountFrequency (%)
ʻ 3
100.0%
Punctuation
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

verbatimElevation
Text

Missing 

Distinct: 126
Distinct (%): 27.3%
Missing: 1925931
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 44
Median length: 4
Mean length: 10.17099567
Min length: 4

Characters and Unicode

Total characters: 4699
Distinct characters: 51
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 65
Unique (%): 14.1%

Sample

1st row: 7000
2nd row: 4070 m.a.s.l.
3rd row: 4200-4400
4th row: 2009 +/- 20.1 feet
5th row: 3000

Value                   Count      Frequency (%)
collected               53         5.6%
on                      53         5.6%
and                     51         5.4%
flat                    50         5.3%
lagoon                  50         5.3%
slope                   50         5.3%
m                       27         2.8%
3800                    23         2.4%
2550                    21         2.2%
above                   19         2.0%
Other values (148)      554        58.3%

Most occurring characters

ValueCountFrequency (%)
0 660
14.0%
489
 
10.4%
l 346
 
7.4%
e 330
 
7.0%
o 320
 
6.8%
a 237
 
5.0%
3 219
 
4.7%
5 218
 
4.6%
t 202
 
4.3%
n 193
 
4.1%
Other values (41) 1485
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2418
51.5%
Decimal Number 1576
33.5%
Space Separator 489
 
10.4%
Other Punctuation 72
 
1.5%
Uppercase Letter 70
 
1.5%
Dash Punctuation 34
 
0.7%
Open Punctuation 17
 
0.4%
Close Punctuation 17
 
0.4%
Math Symbol 6
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 346
14.3%
e 330
13.6%
o 320
13.2%
a 237
9.8%
t 202
8.4%
n 193
8.0%
s 124
 
5.1%
d 118
 
4.9%
f 84
 
3.5%
m 80
 
3.3%
Other values (13) 384
15.9%
Decimal Number
ValueCountFrequency (%)
0 660
41.9%
3 219
 
13.9%
5 218
 
13.8%
2 127
 
8.1%
4 98
 
6.2%
8 74
 
4.7%
1 68
 
4.3%
7 48
 
3.0%
9 43
 
2.7%
6 21
 
1.3%
Other Punctuation
ValueCountFrequency (%)
. 40
55.6%
' 20
27.8%
? 6
 
8.3%
, 3
 
4.2%
/ 2
 
2.8%
; 1
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
C 51
72.9%
E 15
 
21.4%
A 2
 
2.9%
I 1
 
1.4%
T 1
 
1.4%
Math Symbol
ValueCountFrequency (%)
~ 3
50.0%
+ 2
33.3%
> 1
 
16.7%
Space Separator
ValueCountFrequency (%)
489
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%
Open Punctuation
ValueCountFrequency (%)
( 17
100.0%
Close Punctuation
ValueCountFrequency (%)
) 17
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2488
52.9%
Common 2211
47.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 346
13.9%
e 330
13.3%
o 320
12.9%
a 237
9.5%
t 202
8.1%
n 193
7.8%
s 124
 
5.0%
d 118
 
4.7%
f 84
 
3.4%
m 80
 
3.2%
Other values (18) 454
18.2%
Common
ValueCountFrequency (%)
0 660
29.9%
489
22.1%
3 219
 
9.9%
5 218
 
9.9%
2 127
 
5.7%
4 98
 
4.4%
8 74
 
3.3%
1 68
 
3.1%
7 48
 
2.2%
9 43
 
1.9%
Other values (13) 167
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4699
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 660
14.0%
489
 
10.4%
l 346
 
7.4%
e 330
 
7.0%
o 320
 
6.8%
a 237
 
5.0%
3 219
 
4.7%
5 218
 
4.6%
t 202
 
4.3%
n 193
 
4.1%
Other values (41) 1485
31.6%

verbatimDepth
Text

Missing 

Distinct: 1530
Distinct (%): 5.8%
Missing: 1900149
Missing (%): 98.6%
Memory size: 14.7 MiB

Length

Max length: 99
Median length: 91
Mean length: 13.43716659
Min length: 1

Characters and Unicode

Total characters: 352645
Distinct characters: 79
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 721
Unique (%): 2.7%

Sample

1st row: Surface
2nd row: max depth 1772 ft
3rd row: surface
4th row: Intertidal
5th row: Intertidal

Value                   Count      Frequency (%)
intertidal              11932      23.4%
surface                 4085       8.0%
recorded                2871       5.6%
depths                  2850       5.6%
multiple                2846       5.6%
shore                   1165       2.3%
0-300                   1120       2.2%
0                       1069       2.1%
depth                   1023       2.0%
low                     964        1.9%
Other values (1043)     21003      41.2%

Most occurring characters

ValueCountFrequency (%)
t 36687
 
10.4%
e 35142
 
10.0%
r 25391
 
7.2%
24684
 
7.0%
d 24177
 
6.9%
l 20651
 
5.9%
a 20481
 
5.8%
i 19392
 
5.5%
0 16029
 
4.5%
n 14727
 
4.2%
Other values (69) 115284
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 250901
71.1%
Decimal Number 39304
 
11.1%
Space Separator 24684
 
7.0%
Uppercase Letter 19960
 
5.7%
Other Punctuation 12440
 
3.5%
Dash Punctuation 4883
 
1.4%
Math Symbol 236
 
0.1%
Open Punctuation 118
 
< 0.1%
Close Punctuation 118
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 36687
14.6%
e 35142
14.0%
r 25391
10.1%
d 24177
9.6%
l 20651
8.2%
a 20481
8.2%
i 19392
7.7%
n 14727
 
5.9%
c 8176
 
3.3%
p 7645
 
3.0%
Other values (15) 38432
15.3%
Uppercase Letter
ValueCountFrequency (%)
I 10834
54.3%
S 4489
22.5%
M 2986
 
15.0%
L 758
 
3.8%
T 218
 
1.1%
B 109
 
0.5%
H 83
 
0.4%
D 78
 
0.4%
C 73
 
0.4%
Z 59
 
0.3%
Other values (14) 273
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 5999
48.2%
: 3688
29.6%
. 1398
 
11.2%
" 841
 
6.8%
; 207
 
1.7%
' 201
 
1.6%
@ 43
 
0.3%
/ 29
 
0.2%
& 22
 
0.2%
? 10
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 16029
40.8%
1 4890
 
12.4%
2 3729
 
9.5%
3 3379
 
8.6%
5 2940
 
7.5%
8 2556
 
6.5%
4 1747
 
4.4%
6 1723
 
4.4%
7 1433
 
3.6%
9 878
 
2.2%
Math Symbol
ValueCountFrequency (%)
< 138
58.5%
= 60
25.4%
+ 24
 
10.2%
~ 14
 
5.9%
Space Separator
ValueCountFrequency (%)
24684
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4883
100.0%
Open Punctuation
ValueCountFrequency (%)
( 118
100.0%
Close Punctuation
ValueCountFrequency (%)
) 118
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 270861
76.8%
Common 81784
 
23.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 36687
13.5%
e 35142
13.0%
r 25391
9.4%
d 24177
8.9%
l 20651
 
7.6%
a 20481
 
7.6%
i 19392
 
7.2%
n 14727
 
5.4%
I 10834
 
4.0%
c 8176
 
3.0%
Other values (39) 55203
20.4%
Common
ValueCountFrequency (%)
24684
30.2%
0 16029
19.6%
, 5999
 
7.3%
1 4890
 
6.0%
- 4883
 
6.0%
2 3729
 
4.6%
: 3688
 
4.5%
3 3379
 
4.1%
5 2940
 
3.6%
8 2556
 
3.1%
Other values (20) 9007
 
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 352644
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 36687
 
10.4%
e 35142
 
10.0%
r 25391
 
7.2%
24684
 
7.0%
d 24177
 
6.9%
l 20651
 
5.9%
a 20481
 
5.8%
i 19392
 
5.5%
0 16029
 
4.5%
n 14727
 
4.2%
Other values (68) 115283
32.7%
Punctuation
ValueCountFrequency (%)
1
100.0%

decimalLatitude
Text

Missing 

Distinct: 70087
Distinct (%): 7.0%
Missing: 927346
Missing (%): 48.1%
Memory size: 14.7 MiB

Length

Max length: 9
Median length: 7
Mean length: 6.236377268
Min length: 3

Characters and Unicode

Total characters: 6230434
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26230
Unique (%): 2.6%

Sample

1st row: 38.7117
2nd row: 25.2819
3rd row: -62.667
4th row: 42.0833
5th row: 13.7792

Value                   Count      Frequency (%)
25.58                   10489      1.0%
40.6583                 8821       0.9%
26.17                   7320       0.7%
26.5                    5196       0.5%
26.97                   3956       0.4%
25.7883                 3457       0.3%
9.4                     3109       0.3%
9.37                    2978       0.3%
40.895                  2590       0.3%
40.66                   2520       0.3%
Other values (65558)    948611     95.0%

Most occurring characters

ValueCountFrequency (%)
. 999047
16.0%
3 788192
12.7%
2 616242
9.9%
5 525610
8.4%
7 524945
8.4%
4 501558
8.1%
1 480743
7.7%
6 474865
7.6%
8 472201
7.6%
9 377091
 
6.1%
Other values (3) 469940
7.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5078800
81.5%
Other Punctuation 999047
 
16.0%
Dash Punctuation 152586
 
2.4%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 788192
15.5%
2 616242
12.1%
5 525610
10.3%
7 524945
10.3%
4 501558
9.9%
1 480743
9.5%
6 474865
9.3%
8 472201
9.3%
9 377091
7.4%
0 317353
6.2%
Other Punctuation
ValueCountFrequency (%)
. 999047
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 152586
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6230433
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 999047
16.0%
3 788192
12.7%
2 616242
9.9%
5 525610
8.4%
7 524945
8.4%
4 501558
8.1%
1 480743
7.7%
6 474865
7.6%
8 472201
7.6%
9 377091
 
6.1%
Other values (2) 469939
7.5%
Latin
ValueCountFrequency (%)
E 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6230434
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 999047
16.0%
3 788192
12.7%
2 616242
9.9%
5 525610
8.4%
7 524945
8.4%
4 501558
8.1%
1 480743
7.7%
6 474865
7.6%
8 472201
7.6%
9 377091
 
6.1%
Other values (3) 469940
7.5%
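
decimalLatitude is profiled as Text, and its character tables include a single uppercase "E", so at least one value is not a plain decimal number. The sketch below shows one way to separate parseable latitudes from the rest, using only the standard library. The function name `check_latitudes` is hypothetical, and the last four non-missing inputs ("1.2E1", "north", "123.4") are invented to exercise each branch; only the first five come from the sample rows above. Whether the "E" in the real data is scientific notation or something else is not known from this report.

```python
def check_latitudes(values):
    """Split raw text values into valid latitudes and rejected strings."""
    valid, rejected = [], []
    for raw in values:
        if raw is None:
            continue  # missing values are neither valid nor rejected
        try:
            number = float(raw)  # accepts scientific notation such as "1.2E1"
        except ValueError:
            rejected.append(raw)
            continue
        # Latitudes must lie between -90 and 90 degrees inclusive.
        if -90.0 <= number <= 90.0:
            valid.append(number)
        else:
            rejected.append(raw)
    return valid, rejected


sample = ["38.7117", "25.2819", "-62.667", "42.0833", "13.7792",
          "1.2E1", "north", "123.4", None]
valid, rejected = check_latitudes(sample)
print(valid)
print(rejected)
```

The same check with bounds of -180 and 180 would apply to decimalLongitude.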

decimalLongitude
Text

Missing 

Distinct: 74625
Distinct (%): 7.5%
Missing: 927346
Missing (%): 48.1%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 8
Mean length: 7.110920707
Min length: 3

Characters and Unicode

Total characters: 7104144
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27280
Unique (%): 2.7%

Sample

1st row: -73.405
2nd row: -83.6297
3rd row: -54.742
4th row: -66.7708
5th row: 121.586

Value                   Count      Frequency (%)
80.1                    10529      1.1%
127.848                 4532       0.5%
67.7683                 4213       0.4%
80.13                   3738       0.4%
82.7                    3518       0.4%
67.77                   2821       0.3%
66.775                  2592       0.3%
81.6633                 2462       0.2%
70.6731                 2397       0.2%
67.755                  2356       0.2%
Other values (69839)    959889     96.1%

Most occurring characters

ValueCountFrequency (%)
. 999047
14.1%
- 826266
11.6%
7 744654
10.5%
8 682740
9.6%
1 674897
9.5%
6 575520
8.1%
3 562337
7.9%
2 472623
6.7%
5 432907
6.1%
9 409886
5.8%
Other values (2) 723267
10.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5278831
74.3%
Other Punctuation 999047
 
14.1%
Dash Punctuation 826266
 
11.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 744654
14.1%
8 682740
12.9%
1 674897
12.8%
6 575520
10.9%
3 562337
10.7%
2 472623
9.0%
5 432907
8.2%
9 409886
7.8%
0 371211
7.0%
4 352056
6.7%
Other Punctuation
ValueCountFrequency (%)
. 999047
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 826266
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7104144
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 999047
14.1%
- 826266
11.6%
7 744654
10.5%
8 682740
9.6%
1 674897
9.5%
6 575520
8.1%
3 562337
7.9%
2 472623
6.7%
5 432907
6.1%
9 409886
5.8%
Other values (2) 723267
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7104144
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 999047
14.1%
- 826266
11.6%
7 744654
10.5%
8 682740
9.6%
1 674897
9.5%
6 575520
8.1%
3 562337
7.9%
2 472623
6.7%
5 432907
6.1%
9 409886
5.8%
Other values (2) 723267
10.2%
Distinct: 9
Distinct (%): < 0.1%
Missing: 1246885
Missing (%): 64.7%
Memory size: 14.7 MiB

Length

Max length: 23
Median length: 23
Mean length: 22.60567057
Min length: 3

Characters and Unicode

Total characters: 15360734
Distinct characters: 30
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Degrees Minutes Seconds
2nd row: Degrees Minutes Seconds
3rd row: Degrees Minutes Seconds
4th row: Degrees Minutes Seconds
5th row: Degrees Minutes Seconds

Value                   Count      Frequency (%)
degrees                 670900     33.4%
minutes                 648195     32.3%
seconds                 648195     32.3%
decimal                 22705      1.1%
township                7004       0.3%
range                   7004       0.3%
marsden                 605        < 0.1%
square                  605        < 0.1%
unknown                 532        < 0.1%
utm                     464        < 0.1%
Other values (3)        6          < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 3340010
21.7%
s 1974899
12.9%
1326707
 
8.6%
n 1312599
 
8.5%
g 677904
 
4.4%
i 677904
 
4.4%
r 672113
 
4.4%
d 671463
 
4.4%
D 670945
 
4.4%
c 670901
 
4.4%
Other values (20) 3365289
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12049540
78.4%
Uppercase Letter 1984487
 
12.9%
Space Separator 1326707
 
8.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3340010
27.7%
s 1974899
16.4%
n 1312599
 
10.9%
g 677904
 
5.6%
i 677904
 
5.6%
r 672113
 
5.6%
d 671463
 
5.6%
c 670901
 
5.6%
o 655733
 
5.4%
u 648803
 
5.4%
Other values (9) 747211
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
D 670945
33.8%
M 649264
32.7%
S 648800
32.7%
T 7468
 
0.4%
R 7004
 
0.4%
U 998
 
0.1%
Q 3
 
< 0.1%
A 2
 
< 0.1%
F 2
 
< 0.1%
G 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1326707
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14034027
91.4%
Common 1326707
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3340010
23.8%
s 1974899
14.1%
n 1312599
 
9.4%
g 677904
 
4.8%
i 677904
 
4.8%
r 672113
 
4.8%
d 671463
 
4.8%
D 670945
 
4.8%
c 670901
 
4.8%
o 655733
 
4.7%
Other values (19) 2709556
19.3%
Common
ValueCountFrequency (%)
1326707
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15360734
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3340010
21.7%
s 1974899
12.9%
1326707
 
8.6%
n 1312599
 
8.5%
g 677904
 
4.4%
i 677904
 
4.4%
r 672113
 
4.4%
d 671463
 
4.4%
D 670945
 
4.4%
c 670901
 
4.4%
Other values (20) 3365289
21.9%

verbatimSRS
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 20
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 1936-08-14
2nd row: 1926-08-24

Value                   Count      Frequency (%)
1936-08-14              1          50.0%
1926-08-24              1          50.0%

Most occurring characters

ValueCountFrequency (%)
- 4
20.0%
1 3
15.0%
9 2
10.0%
6 2
10.0%
0 2
10.0%
8 2
10.0%
4 2
10.0%
2 2
10.0%
3 1
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16
80.0%
Dash Punctuation 4
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3
18.8%
9 2
12.5%
6 2
12.5%
0 2
12.5%
8 2
12.5%
4 2
12.5%
2 2
12.5%
3 1
 
6.2%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 20
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 4
20.0%
1 3
15.0%
9 2
10.0%
6 2
10.0%
0 2
10.0%
8 2
10.0%
4 2
10.0%
2 2
10.0%
3 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 4
20.0%
1 3
15.0%
9 2
10.0%
6 2
10.0%
0 2
10.0%
8 2
10.0%
4 2
10.0%
2 2
10.0%
3 1
 
5.0%
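
Both non-missing verbatimSRS values are shaped like ISO dates rather than spatial reference system identifiers. One possible explanation, which this report cannot confirm, is that two records had their fields shifted into the wrong columns upstream. The sketch below shows a simple way to flag such rows for manual review. The function name `find_date_like`, the record IDs, and the "EPSG:4326" row are hypothetical; only the two date strings come from this report.

```python
import re

# Matches strings shaped like YYYY-MM-DD; it does not validate the calendar date.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def find_date_like(records, column):
    """Return the records whose value in `column` is shaped like an ISO date."""
    flagged = []
    for record in records:
        value = record.get(column)
        if value is not None and ISO_DATE.match(value.strip()):
            flagged.append(record)
    return flagged


# Hypothetical records for illustration.
records = [
    {"id": "a", "verbatimSRS": "1936-08-14"},
    {"id": "b", "verbatimSRS": "1926-08-24"},
    {"id": "c", "verbatimSRS": "EPSG:4326"},
    {"id": "d", "verbatimSRS": None},
]
suspect = find_date_like(records, "verbatimSRS")
print([r["id"] for r in suspect])
```

A similar pattern check could be applied to the neighbouring sparse columns, which also hold only two values each.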

footprintSRS
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 227
2nd row: 236

Value                   Count      Frequency (%)
227                     1          50.0%
236                     1          50.0%

Most occurring characters

ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

footprintSpatialFit
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 227
2nd row: 236

Value                   Count      Frequency (%)
227                     1          50.0%
236                     1          50.0%

Most occurring characters

ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

georeferencedBy
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 8
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 1936
2nd row: 1926

Value                   Count      Frequency (%)
1936                    1          50.0%
1926                    1          50.0%

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

georeferencedDate
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8
2nd row: 8

Value                   Count      Frequency (%)
8                       2          100.0%

Most occurring characters

ValueCountFrequency (%)
8 2
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 2
100.0%

georeferenceProtocol
Text

Missing 

Distinct: 115
Distinct (%): < 0.1%
Missing: 1265790
Missing (%): 65.7%
Memory size: 14.7 MiB

Length

Max length: 87
Median length: 20
Mean length: 20.10026748
Min length: 2

Characters and Unicode

Total characters: 13278297
Distinct characters: 64
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): < 0.1%

Sample

1st row: unknown, from legacy
2nd row: unknown, from legacy
3rd row: unknown, from legacy
4th row: unknown, from legacy
5th row: unknown, from legacy

Value                   Count      Frequency (%)
from                    509060     26.2%
unknown                 507577     26.1%
legacy                  505126     26.0%
geolocate               70310      3.6%
names                   41937      2.2%
geographic              41556      2.1%
of                      35279      1.8%
getty                   34687      1.8%
thesaurus               34686      1.8%
may                     23191      1.2%
Other values (131)      141522     7.3%

Most occurring characters

ValueCountFrequency (%)
n 1560807
 
11.8%
1284328
 
9.7%
o 1253394
 
9.4%
e 822048
 
6.2%
a 797027
 
6.0%
r 642026
 
4.8%
c 624647
 
4.7%
g 591299
 
4.5%
u 580748
 
4.4%
y 577424
 
4.3%
Other values (54) 4544549
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10761243
81.0%
Space Separator 1284328
 
9.7%
Uppercase Letter 560123
 
4.2%
Other Punctuation 551606
 
4.2%
Decimal Number 114476
 
0.9%
Dash Punctuation 3269
 
< 0.1%
Close Punctuation 1624
 
< 0.1%
Open Punctuation 1624
 
< 0.1%
Connector Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1560807
14.5%
o 1253394
11.6%
e 822048
 
7.6%
a 797027
 
7.4%
r 642026
 
6.0%
c 624647
 
5.8%
g 591299
 
5.5%
u 580748
 
5.4%
y 577424
 
5.4%
m 572941
 
5.3%
Other values (14) 2738882
25.5%
Uppercase Letter
ValueCountFrequency (%)
G 185819
33.2%
L 76699
13.7%
E 75168
13.4%
O 56836
 
10.1%
N 43901
 
7.8%
T 36738
 
6.6%
M 26366
 
4.7%
S 23940
 
4.3%
U 8298
 
1.5%
I 8277
 
1.5%
Other values (9) 18081
 
3.2%
Decimal Number
ValueCountFrequency (%)
0 52676
46.0%
2 49533
43.3%
9 5900
 
5.2%
4 2928
 
2.6%
1 1976
 
1.7%
5 1442
 
1.3%
8 15
 
< 0.1%
7 4
 
< 0.1%
3 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
, 528985
95.9%
/ 9412
 
1.7%
. 9111
 
1.7%
: 3483
 
0.6%
& 594
 
0.1%
! 18
 
< 0.1%
' 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1284328
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3269
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1624
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1624
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11321366
85.3%
Common 1956931
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 1560807
13.8%
o 1253394
 
11.1%
e 822048
 
7.3%
a 797027
 
7.0%
r 642026
 
5.7%
c 624647
 
5.5%
g 591299
 
5.2%
u 580748
 
5.1%
y 577424
 
5.1%
m 572941
 
5.1%
Other values (33) 3299005
29.1%
Common
ValueCountFrequency (%)
1284328
65.6%
, 528985
27.0%
0 52676
 
2.7%
2 49533
 
2.5%
/ 9412
 
0.5%
. 9111
 
0.5%
9 5900
 
0.3%
: 3483
 
0.2%
- 3269
 
0.2%
4 2928
 
0.1%
Other values (11) 7306
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13278297
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 1560807
 
11.8%
1284328
 
9.7%
o 1253394
 
9.4%
e 822048
 
6.2%
a 797027
 
6.0%
r 642026
 
4.8%
c 624647
 
4.7%
g 591299
 
4.5%
u 580748
 
4.4%
y 577424
 
4.3%
Other values (54) 4544549
34.2%

georeferenceSources
Text

Missing 

Distinct: 2
Distinct (%): 66.7%
Missing: 1926390
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 13
Median length: 8
Mean length: 9.666666667
Min length: 8

Characters and Unicode

Total characters: 29
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 33.3%

Sample

1st row: PARATYPE
2nd row: NORTH_AMERICA
3rd row: PARATYPE

Value          Count  Frequency (%)
paratype       2      66.7%
north_america  1      33.3%

Most occurring characters

ValueCountFrequency (%)
A 6
20.7%
P 4
13.8%
R 4
13.8%
T 3
10.3%
E 3
10.3%
Y 2
 
6.9%
N 1
 
3.4%
O 1
 
3.4%
H 1
 
3.4%
_ 1
 
3.4%
Other values (3) 3
10.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 28
96.6%
Connector Punctuation 1
 
3.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 6
21.4%
P 4
14.3%
R 4
14.3%
T 3
10.7%
E 3
10.7%
Y 2
 
7.1%
N 1
 
3.6%
O 1
 
3.6%
H 1
 
3.6%
M 1
 
3.6%
Other values (2) 2
 
7.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28
96.6%
Common 1
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 6
21.4%
P 4
14.3%
R 4
14.3%
T 3
10.7%
E 3
10.7%
Y 2
 
7.1%
N 1
 
3.6%
O 1
 
3.6%
H 1
 
3.6%
M 1
 
3.6%
Other values (2) 2
 
7.1%
Common
ValueCountFrequency (%)
_ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 6
20.7%
P 4
13.8%
R 4
13.8%
T 3
10.3%
E 3
10.3%
Y 2
 
6.9%
N 1
 
3.4%
O 1
 
3.4%
H 1
 
3.4%
_ 1
 
3.4%
Other values (3) 3
10.3%

georeferenceRemarks
Text

Missing 

Distinct: 4822
Distinct (%): 15.9%
Missing: 1896105
Missing (%): 98.4%
Memory size: 14.7 MiB

Length

Max length: 122
Median length: 118
Mean length: 23.03717644
Min length: 1

Characters and Unicode

Total characters: 697750
Distinct characters: 78
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 3165
Unique (%): 10.4%

Sample

1st row: Extended About 16 Km Offshore From Crystal River Power Plant
2nd row: 0.8 mile west of Montgomery-Polk county line, north side of
3rd row: San Andreas Fault
4th row: 6 Mile W Of Watsonville
5th row: from Holt data card

Value                Count   Frequency (%)
approximate          9789    8.9%
from                 6478    5.9%
river                3464    3.2%
of                   3097    2.8%
about                3076    2.8%
16                   2974    2.7%
km                   2970    2.7%
plant                2933    2.7%
power                2929    2.7%
offshore             2929    2.7%
Other values (4971)  68760   62.9%

Most occurring characters

ValueCountFrequency (%)
79111
 
11.3%
a 60517
 
8.7%
e 55652
 
8.0%
o 49194
 
7.1%
r 47507
 
6.8%
t 40249
 
5.8%
i 29470
 
4.2%
n 26681
 
3.8%
p 24672
 
3.5%
m 24234
 
3.5%
Other values (68) 260463
37.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 519837
74.5%
Space Separator 79111
 
11.3%
Uppercase Letter 71789
 
10.3%
Decimal Number 14985
 
2.1%
Other Punctuation 10495
 
1.5%
Close Punctuation 574
 
0.1%
Open Punctuation 570
 
0.1%
Dash Punctuation 354
 
0.1%
Math Symbol 35
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 60517
11.6%
e 55652
10.7%
o 49194
 
9.5%
r 47507
 
9.1%
t 40249
 
7.7%
i 29470
 
5.7%
n 26681
 
5.1%
p 24672
 
4.7%
m 24234
 
4.7%
l 23891
 
4.6%
Other values (16) 137770
26.5%
Uppercase Letter
ValueCountFrequency (%)
P 8828
12.3%
R 7369
10.3%
C 6873
 
9.6%
O 6477
 
9.0%
B 4747
 
6.6%
A 4364
 
6.1%
F 4160
 
5.8%
E 3949
 
5.5%
S 3870
 
5.4%
K 3632
 
5.1%
Other values (16) 17520
24.4%
Decimal Number
ValueCountFrequency (%)
1 4141
27.6%
6 3403
22.7%
5 1661
11.1%
0 1443
 
9.6%
3 1434
 
9.6%
2 951
 
6.3%
4 876
 
5.8%
7 488
 
3.3%
8 412
 
2.7%
9 176
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 5320
50.7%
. 2062
 
19.6%
/ 1922
 
18.3%
: 460
 
4.4%
' 363
 
3.5%
; 284
 
2.7%
" 42
 
0.4%
& 23
 
0.2%
# 19
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 564
98.3%
] 10
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 560
98.2%
[ 10
 
1.8%
Space Separator
ValueCountFrequency (%)
79111
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 354
100.0%
Math Symbol
ValueCountFrequency (%)
+ 35
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 591626
84.8%
Common 106124
 
15.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 60517
 
10.2%
e 55652
 
9.4%
o 49194
 
8.3%
r 47507
 
8.0%
t 40249
 
6.8%
i 29470
 
5.0%
n 26681
 
4.5%
p 24672
 
4.2%
m 24234
 
4.1%
l 23891
 
4.0%
Other values (42) 209559
35.4%
Common
ValueCountFrequency (%)
79111
74.5%
, 5320
 
5.0%
1 4141
 
3.9%
6 3403
 
3.2%
. 2062
 
1.9%
/ 1922
 
1.8%
5 1661
 
1.6%
0 1443
 
1.4%
3 1434
 
1.4%
2 951
 
0.9%
Other values (16) 4676
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 697750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
79111
 
11.3%
a 60517
 
8.7%
e 55652
 
8.0%
o 49194
 
7.1%
r 47507
 
6.8%
t 40249
 
5.8%
i 29470
 
4.2%
n 26681
 
3.8%
p 24672
 
3.5%
m 24234
 
3.5%
Other values (68) 260463
37.3%

latestEonOrHighestEonothem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926392
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: US

Value  Count  Frequency (%)
us     1      100.0%

Most occurring characters

ValueCountFrequency (%)
U 1
50.0%
S 1
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 1
50.0%
S 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 1
50.0%
S 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 1
50.0%
S 1
50.0%

earliestEraOrLowestErathem
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926392
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Idaho

Value  Count  Frequency (%)
idaho  1      100.0%

Most occurring characters

ValueCountFrequency (%)
I 1
20.0%
d 1
20.0%
a 1
20.0%
h 1
20.0%
o 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4
80.0%
Uppercase Letter 1
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 1
25.0%
a 1
25.0%
h 1
25.0%
o 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
I 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1
20.0%
d 1
20.0%
a 1
20.0%
h 1
20.0%
o 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1
20.0%
d 1
20.0%
a 1
20.0%
h 1
20.0%
o 1
20.0%
Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 14
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 6482728
2nd row: 2504455

Value    Count  Frequency (%)
6482728  1      50.0%
2504455  1      50.0%

Most occurring characters

ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%
Distinct: 3
Distinct (%): 100.0%
Missing: 1926390
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 75
Median length: 37
Mean length: 46.66666667
Min length: 28

Characters and Unicode

Total characters: 140
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 3
Unique (%): 100.0%

Sample

1st row: North America, North Pacific Ocean, Departure Bay, Canada, British Columbia
2nd row: North America, United States, Georgia
3rd row: North America, United States

Value             Count  Frequency (%)
north             4      21.1%
america           3      15.8%
united            2      10.5%
states            2      10.5%
pacific           1      5.3%
ocean             1      5.3%
departure         1      5.3%
bay               1      5.3%
canada            1      5.3%
british           1      5.3%
Other values (2)  2      10.5%

Most occurring characters

ValueCountFrequency (%)
16
 
11.4%
a 14
 
10.0%
t 12
 
8.6%
r 11
 
7.9%
e 11
 
7.9%
i 11
 
7.9%
, 7
 
5.0%
o 6
 
4.3%
c 6
 
4.3%
h 5
 
3.6%
Other values (21) 41
29.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 98
70.0%
Uppercase Letter 19
 
13.6%
Space Separator 16
 
11.4%
Other Punctuation 7
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 14
14.3%
t 12
12.2%
r 11
11.2%
e 11
11.2%
i 11
11.2%
o 6
6.1%
c 6
6.1%
h 5
 
5.1%
n 4
 
4.1%
m 4
 
4.1%
Other values (9) 14
14.3%
Uppercase Letter
ValueCountFrequency (%)
N 4
21.1%
A 3
15.8%
S 2
10.5%
U 2
10.5%
B 2
10.5%
C 2
10.5%
G 1
 
5.3%
O 1
 
5.3%
D 1
 
5.3%
P 1
 
5.3%
Space Separator
ValueCountFrequency (%)
16
100.0%
Other Punctuation
ValueCountFrequency (%)
, 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 117
83.6%
Common 23
 
16.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 14
12.0%
t 12
 
10.3%
r 11
 
9.4%
e 11
 
9.4%
i 11
 
9.4%
o 6
 
5.1%
c 6
 
5.1%
h 5
 
4.3%
n 4
 
3.4%
N 4
 
3.4%
Other values (19) 33
28.2%
Common
ValueCountFrequency (%)
16
69.6%
, 7
30.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 140
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
16
 
11.4%
a 14
 
10.0%
t 12
 
8.6%
r 11
 
7.9%
e 11
 
7.9%
i 11
 
7.9%
, 7
 
5.0%
o 6
 
4.3%
c 6
 
4.3%
h 5
 
3.6%
Other values (21) 41
29.3%

earliestAgeOrLowestStage
Text

Constant  Missing 

Distinct: 1
Distinct (%): 33.3%
Missing: 1926390
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 39
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NORTH_AMERICA
2nd row: NORTH_AMERICA
3rd row: NORTH_AMERICA

Value          Count  Frequency (%)
north_america  3      100.0%

Most occurring characters

ValueCountFrequency (%)
R 6
15.4%
A 6
15.4%
N 3
7.7%
O 3
7.7%
T 3
7.7%
H 3
7.7%
_ 3
7.7%
M 3
7.7%
E 3
7.7%
I 3
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 36
92.3%
Connector Punctuation 3
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 6
16.7%
A 6
16.7%
N 3
8.3%
O 3
8.3%
T 3
8.3%
H 3
8.3%
M 3
8.3%
E 3
8.3%
I 3
8.3%
C 3
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 36
92.3%
Common 3
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 6
16.7%
A 6
16.7%
N 3
8.3%
O 3
8.3%
T 3
8.3%
H 3
8.3%
M 3
8.3%
E 3
8.3%
I 3
8.3%
C 3
8.3%
Common
ValueCountFrequency (%)
_ 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 6
15.4%
A 6
15.4%
N 3
7.7%
O 3
7.7%
T 3
7.7%
H 3
7.7%
_ 3
7.7%
M 3
7.7%
E 3
7.7%
I 3
7.7%

latestAgeOrHighestStage
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926392
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 34
Median length: 34
Mean length: 34
Min length: 34

Characters and Unicode

Total characters: 34
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: North Pacific Ocean, Departure Bay

Value      Count  Frequency (%)
north      1      20.0%
pacific    1      20.0%
ocean      1      20.0%
departure  1      20.0%
bay        1      20.0%

Most occurring characters

ValueCountFrequency (%)
4
 
11.8%
a 4
 
11.8%
r 3
 
8.8%
c 3
 
8.8%
e 3
 
8.8%
t 2
 
5.9%
i 2
 
5.9%
N 1
 
2.9%
, 1
 
2.9%
B 1
 
2.9%
Other values (10) 10
29.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24
70.6%
Uppercase Letter 5
 
14.7%
Space Separator 4
 
11.8%
Other Punctuation 1
 
2.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
16.7%
r 3
12.5%
c 3
12.5%
e 3
12.5%
t 2
8.3%
i 2
8.3%
u 1
 
4.2%
p 1
 
4.2%
f 1
 
4.2%
n 1
 
4.2%
Other values (3) 3
12.5%
Uppercase Letter
ValueCountFrequency (%)
N 1
20.0%
B 1
20.0%
D 1
20.0%
O 1
20.0%
P 1
20.0%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 29
85.3%
Common 5
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
13.8%
r 3
 
10.3%
c 3
 
10.3%
e 3
 
10.3%
t 2
 
6.9%
i 2
 
6.9%
N 1
 
3.4%
B 1
 
3.4%
u 1
 
3.4%
p 1
 
3.4%
Other values (8) 8
27.6%
Common
ValueCountFrequency (%)
4
80.0%
, 1
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4
 
11.8%
a 4
 
11.8%
r 3
 
8.8%
c 3
 
8.8%
e 3
 
8.8%
t 2
 
5.9%
i 2
 
5.9%
N 1
 
2.9%
, 1
 
2.9%
B 1
 
2.9%
Other values (10) 10
29.4%
Distinct: 4
Distinct (%): 80.0%
Missing: 1926388
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 46
Median length: 2
Mean length: 18.8
Min length: 2

Characters and Unicode

Total characters: 94
Distinct characters: 35
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 3
Unique (%): 60.0%

Sample

1st row: Hemionchos striatus Campbell & Beveridge, 2006
2nd row: CA
3rd row: US
4th row: US
5th row: Conspicuum icteridorum Denton & Byrd, 1951

Value             Count  Frequency (%)
us                2      13.3%
(blank)           2      13.3%
hemionchos        1      6.7%
striatus          1      6.7%
campbell          1      6.7%
beveridge         1      6.7%
2006              1      6.7%
ca                1      6.7%
conspicuum        1      6.7%
icteridorum       1      6.7%
Other values (3)  3      20.0%

Most occurring characters

ValueCountFrequency (%)
10
 
10.6%
e 7
 
7.4%
i 6
 
6.4%
o 5
 
5.3%
r 5
 
5.3%
u 4
 
4.3%
m 4
 
4.3%
n 4
 
4.3%
s 4
 
4.3%
t 4
 
4.3%
Other values (25) 41
43.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 60
63.8%
Uppercase Letter 12
 
12.8%
Space Separator 10
 
10.6%
Decimal Number 8
 
8.5%
Other Punctuation 4
 
4.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 7
11.7%
i 6
10.0%
o 5
 
8.3%
r 5
 
8.3%
u 4
 
6.7%
m 4
 
6.7%
n 4
 
6.7%
s 4
 
6.7%
t 4
 
6.7%
d 3
 
5.0%
Other values (9) 14
23.3%
Uppercase Letter
ValueCountFrequency (%)
C 3
25.0%
U 2
16.7%
B 2
16.7%
S 2
16.7%
A 1
 
8.3%
D 1
 
8.3%
H 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
0 2
25.0%
1 2
25.0%
2 1
12.5%
6 1
12.5%
9 1
12.5%
5 1
12.5%
Other Punctuation
ValueCountFrequency (%)
, 2
50.0%
& 2
50.0%
Space Separator
ValueCountFrequency (%)
10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 72
76.6%
Common 22
 
23.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 7
 
9.7%
i 6
 
8.3%
o 5
 
6.9%
r 5
 
6.9%
u 4
 
5.6%
m 4
 
5.6%
n 4
 
5.6%
s 4
 
5.6%
t 4
 
5.6%
d 3
 
4.2%
Other values (16) 26
36.1%
Common
ValueCountFrequency (%)
10
45.5%
0 2
 
9.1%
, 2
 
9.1%
& 2
 
9.1%
1 2
 
9.1%
2 1
 
4.5%
6 1
 
4.5%
9 1
 
4.5%
5 1
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 94
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10
 
10.6%
e 7
 
7.4%
i 6
 
6.4%
o 5
 
5.3%
r 5
 
5.3%
u 4
 
4.3%
m 4
 
4.3%
n 4
 
4.3%
s 4
 
4.3%
t 4
 
4.3%
Other values (25) 41
43.6%

group
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 16
Median length: 11.5
Mean length: 11.5
Min length: 7

Characters and Unicode

Total characters: 23
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: British Columbia
2nd row: Georgia

Value     Count  Frequency (%)
british   1      33.3%
columbia  1      33.3%
georgia   1      33.3%

Most occurring characters

ValueCountFrequency (%)
i 4
17.4%
o 2
 
8.7%
a 2
 
8.7%
r 2
 
8.7%
u 1
 
4.3%
e 1
 
4.3%
G 1
 
4.3%
b 1
 
4.3%
m 1
 
4.3%
B 1
 
4.3%
Other values (7) 7
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19
82.6%
Uppercase Letter 3
 
13.0%
Space Separator 1
 
4.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
21.1%
o 2
10.5%
a 2
10.5%
r 2
10.5%
u 1
 
5.3%
e 1
 
5.3%
b 1
 
5.3%
m 1
 
5.3%
l 1
 
5.3%
h 1
 
5.3%
Other values (3) 3
15.8%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
B 1
33.3%
C 1
33.3%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22
95.7%
Common 1
 
4.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
18.2%
o 2
 
9.1%
a 2
 
9.1%
r 2
 
9.1%
u 1
 
4.5%
e 1
 
4.5%
G 1
 
4.5%
b 1
 
4.5%
m 1
 
4.5%
B 1
 
4.5%
Other values (6) 6
27.3%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4
17.4%
o 2
 
8.7%
a 2
 
8.7%
r 2
 
8.7%
u 1
 
4.3%
e 1
 
4.3%
G 1
 
4.3%
b 1
 
4.3%
m 1
 
4.3%
B 1
 
4.3%
Other values (7) 7
30.4%

bed
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing1926392
Missing (%)> 99.9%
Memory size14.7 MiB
2025-01-08T17:52:55.425462image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters8
Distinct characters8
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowMoultrie
ValueCountFrequency (%)
moultrie 1
100.0%
2025-01-08T17:52:55.516459image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

ValueCountFrequency (%)
M 1
12.5%
o 1
12.5%
u 1
12.5%
l 1
12.5%
t 1
12.5%
r 1
12.5%
i 1
12.5%
e 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1
14.3%
u 1
14.3%
l 1
14.3%
t 1
14.3%
r 1
14.3%
i 1
14.3%
e 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1
12.5%
o 1
12.5%
u 1
12.5%
l 1
12.5%
t 1
12.5%
r 1
12.5%
i 1
12.5%
e 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1
12.5%
o 1
12.5%
u 1
12.5%
l 1
12.5%
t 1
12.5%
r 1
12.5%
i 1
12.5%
e 1
12.5%
Distinct: 7
Distinct (%): < 0.1%
Missing: 1908260
Missing (%): 99.1%
Memory size: 14.7 MiB

Length

Max length: 76
Median length: 3
Mean length: 3.553796945
Min length: 3

Characters and Unicode

Total characters: 64441
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: cf.
2nd row: cf.
3rd row: uncertain
4th row: cf.
5th row: cf.

Value            Count  Frequency (%)
cf               15638  86.2%
uncertain        1489   8.2%
aff              600    3.3%
near             404    2.2%
animalia         2      < 0.1%
platyhelminthes  2      < 0.1%
cestoda          1      < 0.1%
trematoda        1      < 0.1%
digenea          1      < 0.1%
plagiorchiida    1      < 0.1%

Most occurring characters

ValueCountFrequency (%)
c 17130
26.6%
f 16838
26.1%
. 16238
25.2%
n 3387
 
5.3%
a 2506
 
3.9%
e 1903
 
3.0%
r 1896
 
2.9%
i 1502
 
2.3%
t 1495
 
2.3%
u 1487
 
2.3%
Other values (16) 59
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 48178
74.8%
Other Punctuation 16245
 
25.2%
Uppercase Letter 11
 
< 0.1%
Space Separator 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 17130
35.6%
f 16838
34.9%
n 3387
 
7.0%
a 2506
 
5.2%
e 1903
 
3.9%
r 1896
 
3.9%
i 1502
 
3.1%
t 1495
 
3.1%
u 1487
 
3.1%
l 8
 
< 0.1%
Other values (7) 26
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
P 3
27.3%
A 2
18.2%
U 2
18.2%
D 2
18.2%
C 1
 
9.1%
T 1
 
9.1%
Other Punctuation
ValueCountFrequency (%)
. 16238
> 99.9%
, 7
 
< 0.1%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48189
74.8%
Common 16252
 
25.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 17130
35.5%
f 16838
34.9%
n 3387
 
7.0%
a 2506
 
5.2%
e 1903
 
3.9%
r 1896
 
3.9%
i 1502
 
3.1%
t 1495
 
3.1%
u 1487
 
3.1%
l 8
 
< 0.1%
Other values (13) 37
 
0.1%
Common
ValueCountFrequency (%)
. 16238
99.9%
, 7
 
< 0.1%
7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64441
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 17130
26.6%
f 16838
26.1%
. 16238
25.2%
n 3387
 
5.3%
a 2506
 
3.9%
e 1903
 
3.0%
r 1896
 
2.9%
i 1502
 
2.3%
t 1495
 
2.3%
u 1487
 
2.3%
Other values (16) 59
 
0.1%

typeStatus
Text

Missing 

Distinct: 11
Distinct (%): < 0.1%
Missing: 1841066
Missing (%): 95.6%
Memory size: 14.7 MiB

Length

Max length: 13
Median length: 8
Mean length: 7.724987401
Min length: 4

Characters and Unicode

Total characters: 659150
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PARATYPE
2nd row: HOLOTYPE
3rd row: PARATYPE
4th row: HOLOTYPE
5th row: PARATYPE

Value          Count  Frequency (%)
paratype       40578  47.6%
holotype       25358  29.7%
syntype        9555   11.2%
type           4807   5.6%
allotype       2818   3.3%
lectotype      862    1.0%
paralectotype  795    0.9%
neotype        294    0.3%
hapantotype    242    0.3%
paraneotype    16     < 0.1%

Most occurring characters

ValueCountFrequency (%)
P 126956
19.3%
Y 94880
14.4%
E 87292
13.2%
T 87224
13.2%
A 86082
13.1%
O 55743
8.5%
R 41389
 
6.3%
L 32651
 
5.0%
H 25600
 
3.9%
N 10107
 
1.5%
Other values (7) 11226
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 659136
> 99.9%
Lowercase Letter 14
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 126956
19.3%
Y 94880
14.4%
E 87292
13.2%
T 87224
13.2%
A 86082
13.1%
O 55743
8.5%
R 41389
 
6.3%
L 32651
 
5.0%
H 25600
 
3.9%
N 10107
 
1.5%
Other values (2) 11212
 
1.7%
Lowercase Letter
ValueCountFrequency (%)
i 4
28.6%
a 4
28.6%
n 2
14.3%
m 2
14.3%
l 2
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 659150
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 126956
19.3%
Y 94880
14.4%
E 87292
13.2%
T 87224
13.2%
A 86082
13.1%
O 55743
8.5%
R 41389
 
6.3%
L 32651
 
5.0%
H 25600
 
3.9%
N 10107
 
1.5%
Other values (7) 11226
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 659150
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 126956
19.3%
Y 94880
14.4%
E 87292
13.2%
T 87224
13.2%
A 86082
13.1%
O 55743
8.5%
R 41389
 
6.3%
L 32651
 
5.0%
H 25600
 
3.9%
N 10107
 
1.5%
Other values (7) 11226
 
1.7%

identifiedBy
Text

Missing 

Distinct: 13461
Distinct (%): 1.6%
Missing: 1085208
Missing (%): 56.3%
Memory size: 14.7 MiB

Length

Max length: 226
Median length: 133
Mean length: 38.24106825
Min length: 2

Characters and Unicode

Total characters: 32167813
Distinct characters: 94
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 4200
Unique (%): 0.5%

Sample

1st row: Opresko, Dennis M., Oak Ridge National Laboratory (UNITED STATES)
2nd row: Nance
3rd row: Mah, Christopher, (IZ), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
4th row: Verrill, Addison E., Peabody Museum, Yale
5th row: Judkins, D.

Value                Count    Frequency (%)
of                   247193   5.3%
museum               200643   4.3%
national             197127   4.2%
institution          188591   4.1%
smithsonian          186061   4.0%
natural              185777   4.0%
history              185423   4.0%
united               130413   2.8%
states               129643   2.8%
(blank)              87200    1.9%
Other values (9433)  2904278  62.6%

Most occurring characters

ValueCountFrequency (%)
3801164
 
11.8%
a 2080528
 
6.5%
i 2056250
 
6.4%
t 2013216
 
6.3%
n 1896071
 
5.9%
o 1744817
 
5.4%
e 1500120
 
4.7%
r 1384928
 
4.3%
s 1382760
 
4.3%
, 1349377
 
4.2%
Other values (84) 12958582
40.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19467461
60.5%
Uppercase Letter 5957582
 
18.5%
Space Separator 3801164
 
11.8%
Other Punctuation 2377372
 
7.4%
Open Punctuation 230321
 
0.7%
Close Punctuation 230321
 
0.7%
Dash Punctuation 97651
 
0.3%
Decimal Number 5852
 
< 0.1%
Math Symbol 89
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2080528
10.7%
i 2056250
10.6%
t 2013216
10.3%
n 1896071
9.7%
o 1744817
9.0%
e 1500120
7.7%
r 1384928
7.1%
s 1382760
7.1%
u 1079811
 
5.5%
l 969734
 
5.0%
Other values (37) 3359226
17.3%
Uppercase Letter
ValueCountFrequency (%)
S 646353
 
10.8%
N 570409
 
9.6%
M 471253
 
7.9%
I 456265
 
7.7%
T 454120
 
7.6%
H 422947
 
7.1%
E 378825
 
6.4%
A 333756
 
5.6%
D 272623
 
4.6%
C 241437
 
4.1%
Other values (18) 1709594
28.7%
Other Punctuation
ValueCountFrequency (%)
, 1349377
56.8%
. 937315
39.4%
; 64078
 
2.7%
/ 16442
 
0.7%
& 5588
 
0.2%
' 4526
 
0.2%
" 46
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 2732
46.7%
1 2732
46.7%
2 148
 
2.5%
0 92
 
1.6%
6 74
 
1.3%
9 74
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 97644
> 99.9%
7
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3801164
100.0%
Open Punctuation
ValueCountFrequency (%)
( 230321
100.0%
Close Punctuation
ValueCountFrequency (%)
) 230321
100.0%
Math Symbol
ValueCountFrequency (%)
+ 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25425043
79.0%
Common 6742770
 
21.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2080528
 
8.2%
i 2056250
 
8.1%
t 2013216
 
7.9%
n 1896071
 
7.5%
o 1744817
 
6.9%
e 1500120
 
5.9%
r 1384928
 
5.4%
s 1382760
 
5.4%
u 1079811
 
4.2%
l 969734
 
3.8%
Other values (65) 9316808
36.6%
Common
ValueCountFrequency (%)
3801164
56.4%
, 1349377
 
20.0%
. 937315
 
13.9%
( 230321
 
3.4%
) 230321
 
3.4%
- 97644
 
1.4%
; 64078
 
1.0%
/ 16442
 
0.2%
& 5588
 
0.1%
' 4526
 
0.1%
Other values (9) 5994
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32162269
> 99.9%
None 5537
 
< 0.1%
Punctuation 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3801164
 
11.8%
a 2080528
 
6.5%
i 2056250
 
6.4%
t 2013216
 
6.3%
n 1896071
 
5.9%
o 1744817
 
5.4%
e 1500120
 
4.7%
r 1384928
 
4.3%
s 1382760
 
4.3%
, 1349377
 
4.2%
Other values (60) 12953038
40.3%
None
ValueCountFrequency (%)
é 1460
26.4%
í 1289
23.3%
á 848
15.3%
ñ 436
 
7.9%
ã 401
 
7.2%
è 285
 
5.1%
ö 217
 
3.9%
ç 159
 
2.9%
ó 99
 
1.8%
ø 98
 
1.8%
Other values (13) 245
 
4.4%
Punctuation
ValueCountFrequency (%)
7
100.0%

identifiedByID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length9
Median length8
Mean length8
Min length7

Characters and Unicode

Total characters16
Distinct characters10
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
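
As an illustration of the category breakdown reported for this variable, the following Python sketch tallies Unicode general categories with the standard-library `unicodedata` module. It is not the profiler's own implementation: the helper name `category_counts` and the `CATEGORY_NAMES` lookup are assumptions introduced here, and the two input strings are the sample rows listed for identifiedByID.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen in this report.
# (Hypothetical lookup table; codes not listed fall back to the raw code.)
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pc": "Connector Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["Cestoda", "Trematoda"]  # the two non-missing values of identifiedByID
counts = category_counts(sample)
total = sum(counts.values())
for name, n in counts.most_common():
    print(f"{name} {n} {100 * n / total:.1f}%")
```

For these two values the tally gives 14 lowercase and 2 uppercase letters out of 16 characters, which agrees with the "Most occurring categories" table for this variable.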

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowCestoda
2nd rowTrematoda
Value      Count  Frequency (%)
cestoda    1      50.0%
trematoda  1      50.0%

Most occurring characters

ValueCountFrequency (%)
a 3
18.8%
e 2
12.5%
t 2
12.5%
o 2
12.5%
d 2
12.5%
C 1
 
6.2%
s 1
 
6.2%
T 1
 
6.2%
r 1
 
6.2%
m 1
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14
87.5%
Uppercase Letter 2
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
21.4%
e 2
14.3%
t 2
14.3%
o 2
14.3%
d 2
14.3%
s 1
 
7.1%
r 1
 
7.1%
m 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
T 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
18.8%
e 2
12.5%
t 2
12.5%
o 2
12.5%
d 2
12.5%
C 1
 
6.2%
s 1
 
6.2%
T 1
 
6.2%
r 1
 
6.2%
m 1
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3
18.8%
e 2
12.5%
t 2
12.5%
o 2
12.5%
d 2
12.5%
C 1
 
6.2%
s 1
 
6.2%
T 1
 
6.2%
r 1
 
6.2%
m 1
 
6.2%

dateIdentified
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length14
Median length13.5
Mean length13.5
Min length13

Characters and Unicode

Total characters27
Distinct characters14
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowTrypanorhyncha
2nd rowPlagiorchiida
Value           Count  Frequency (%)
trypanorhyncha  1      50.0%
plagiorchiida   1      50.0%

Most occurring characters

ValueCountFrequency (%)
a 4
14.8%
r 3
11.1%
h 3
11.1%
i 3
11.1%
y 2
7.4%
n 2
7.4%
o 2
7.4%
c 2
7.4%
T 1
 
3.7%
p 1
 
3.7%
Other values (4) 4
14.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25
92.6%
Uppercase Letter 2
 
7.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
16.0%
r 3
12.0%
h 3
12.0%
i 3
12.0%
y 2
8.0%
n 2
8.0%
o 2
8.0%
c 2
8.0%
p 1
 
4.0%
l 1
 
4.0%
Other values (2) 2
8.0%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
P 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
14.8%
r 3
11.1%
h 3
11.1%
i 3
11.1%
y 2
7.4%
n 2
7.4%
o 2
7.4%
c 2
7.4%
T 1
 
3.7%
p 1
 
3.7%
Other values (4) 4
14.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
14.8%
r 3
11.1%
h 3
11.1%
i 3
11.1%
y 2
7.4%
n 2
7.4%
o 2
7.4%
c 2
7.4%
T 1
 
3.7%
p 1
 
3.7%
Other values (4) 4
14.8%
Distinct3
Distinct (%)100.0%
Missing1926390
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length17
Median length14
Mean length12.66666667
Min length7
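
The length figures above can be reproduced from the three sample rows listed for this variable. The sketch below is only an illustration using Python's standard `statistics` module; the variable names are assumptions, and the profiler may compute these values differently internally.

```python
from statistics import mean, median

# The three non-missing values shown in this variable's Sample block.
values = ["Eutetrarhynchidae", "31.1435", "Dicrocoeliidae"]

# Character length of each value.
lengths = [len(v) for v in values]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 8),
    "min": min(lengths),
    "total_characters": sum(lengths),
}
print(stats)
```

The lengths are 17, 7 and 14 characters, so the total is 38 and the mean is 38 / 3, matching the reported mean length of 12.66666667.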

Characters and Unicode

Total characters38
Distinct characters20
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)100.0%

Sample

1st rowEutetrarhynchidae
2nd row31.1435
3rd rowDicrocoeliidae
Value              Count  Frequency (%)
eutetrarhynchidae  1      33.3%
31.1435            1      33.3%
dicrocoeliidae     1      33.3%

Most occurring characters

ValueCountFrequency (%)
i 4
 
10.5%
e 4
 
10.5%
r 3
 
7.9%
a 3
 
7.9%
c 3
 
7.9%
t 2
 
5.3%
h 2
 
5.3%
o 2
 
5.3%
d 2
 
5.3%
3 2
 
5.3%
Other values (10) 11
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29
76.3%
Decimal Number 6
 
15.8%
Uppercase Letter 2
 
5.3%
Other Punctuation 1
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
13.8%
e 4
13.8%
r 3
10.3%
a 3
10.3%
c 3
10.3%
t 2
6.9%
h 2
6.9%
o 2
6.9%
d 2
6.9%
u 1
 
3.4%
Other values (3) 3
10.3%
Decimal Number
ValueCountFrequency (%)
3 2
33.3%
1 2
33.3%
4 1
16.7%
5 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
D 1
50.0%
E 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31
81.6%
Common 7
 
18.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
12.9%
e 4
12.9%
r 3
9.7%
a 3
9.7%
c 3
9.7%
t 2
 
6.5%
h 2
 
6.5%
o 2
 
6.5%
d 2
 
6.5%
D 1
 
3.2%
Other values (5) 5
16.1%
Common
ValueCountFrequency (%)
3 2
28.6%
1 2
28.6%
4 1
14.3%
5 1
14.3%
. 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 4
 
10.5%
e 4
 
10.5%
r 3
 
7.9%
a 3
 
7.9%
c 3
 
7.9%
t 2
 
5.3%
h 2
 
5.3%
o 2
 
5.3%
d 2
 
5.3%
3 2
 
5.3%
Other values (10) 11
28.9%

identificationRemarks
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing1926392
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters8
Distinct characters7
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st row-83.7685
Value    Count  Frequency (%)
83.7685  1      100.0%

Most occurring characters

ValueCountFrequency (%)
8 2
25.0%
- 1
12.5%
3 1
12.5%
. 1
12.5%
7 1
12.5%
6 1
12.5%
5 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
75.0%
Dash Punctuation 1
 
12.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 2
33.3%
3 1
16.7%
7 1
16.7%
6 1
16.7%
5 1
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 2
25.0%
- 1
12.5%
3 1
12.5%
. 1
12.5%
7 1
12.5%
6 1
12.5%
5 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 2
25.0%
- 1
12.5%
3 1
12.5%
. 1
12.5%
7 1
12.5%
6 1
12.5%
5 1
12.5%
Distinct94526
Distinct (%)4.9%
Missing2069
Missing (%)0.1%
Memory size14.7 MiB

Length

Max length10
Median length7
Mean length6.457629796
Min length1

Characters and Unicode

Total characters12426572
Distinct characters22
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique27027 ?
Unique (%)1.4%

Sample

1st row2237081
2nd row5189992
3rd row2258402
4th row5187825
5th row9722403
ValueCountFrequency (%)
225 23786
 
1.2%
5967481 15294
 
0.8%
105 11162
 
0.6%
52 8679
 
0.5%
7296 8105
 
0.4%
637 6531
 
0.3%
137 6505
 
0.3%
6540 4668
 
0.2%
255 4580
 
0.2%
256 4175
 
0.2%
Other values (94516) 1830839
95.1%

Most occurring characters

ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%
Other values (12) 20
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12426552
> 99.9%
Lowercase Letter 18
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%
Lowercase Letter
ValueCountFrequency (%)
o 3
16.7%
n 2
11.1%
s 2
11.1%
i 2
11.1%
c 2
11.1%
u 2
11.1%
m 2
11.1%
p 1
 
5.6%
e 1
 
5.6%
h 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
H 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12426552
> 99.9%
Latin 20
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3
15.0%
n 2
10.0%
s 2
10.0%
i 2
10.0%
c 2
10.0%
u 2
10.0%
m 2
10.0%
C 1
 
5.0%
p 1
 
5.0%
H 1
 
5.0%
Other values (2) 2
10.0%
Common
ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12426572
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%
Other values (12) 20
 
< 0.1%

parentNameUsageID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters20
Distinct characters12
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowHemionchos
2nd rowConspicuum
Value       Count  Frequency (%)
hemionchos  1      50.0%
conspicuum  1      50.0%

Most occurring characters

ValueCountFrequency (%)
o 3
15.0%
m 2
10.0%
i 2
10.0%
n 2
10.0%
c 2
10.0%
s 2
10.0%
u 2
10.0%
H 1
 
5.0%
e 1
 
5.0%
h 1
 
5.0%
Other values (2) 2
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
90.0%
Uppercase Letter 2
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3
16.7%
m 2
11.1%
i 2
11.1%
n 2
11.1%
c 2
11.1%
s 2
11.1%
u 2
11.1%
e 1
 
5.6%
h 1
 
5.6%
p 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
H 1
50.0%
C 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3
15.0%
m 2
10.0%
i 2
10.0%
n 2
10.0%
c 2
10.0%
s 2
10.0%
u 2
10.0%
H 1
 
5.0%
e 1
 
5.0%
h 1
 
5.0%
Other values (2) 2
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 3
15.0%
m 2
10.0%
i 2
10.0%
n 2
10.0%
c 2
10.0%
s 2
10.0%
u 2
10.0%
H 1
 
5.0%
e 1
 
5.0%
h 1
 
5.0%
Other values (2) 2
10.0%

namePublishedInID
Text

Missing 

Distinct2
Distinct (%)100.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length11
Median length9.5
Mean length9.5
Min length8

Characters and Unicode

Total characters19
Distinct characters11
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)100.0%

Sample

1st rowstriatus
2nd rowicteridorum
Value        Count  Frequency (%)
striatus     1      50.0%
icteridorum  1      50.0%

Most occurring characters

ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%
Distinct113079
Distinct (%)5.9%
Missing6
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length168
Median length102
Mean length29.16433821
Min length5

Characters and Unicode

Total characters56181802
Distinct characters116
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique38721 ?
Unique (%)2.0%

Sample

1st rowScypha Gray, 1821
2nd rowBulla striata Bruguière, 1792
3rd rowStylopathes columnaris (Duchassaing, 1870)
4th rowOphiothrix suensonii Lütken, 1856
5th rowCypraea labrolineata Gaskoin, 1849
ValueCountFrequency (%)
136410
 
2.0%
linnaeus 96753
 
1.4%
1758 81495
 
1.2%
say 50998
 
0.8%
lamarck 40009
 
0.6%
dall 28184
 
0.4%
conus 24224
 
0.4%
gastropoda 23786
 
0.4%
1791 23649
 
0.3%
gmelin 23215
 
0.3%
Other values (70965) 6239236
92.2%

Most occurring characters

ValueCountFrequency (%)
a 4939373
 
8.8%
4841572
 
8.6%
i 3725884
 
6.6%
e 3410133
 
6.1%
r 2844813
 
5.1%
s 2669041
 
4.8%
o 2472444
 
4.4%
l 2451221
 
4.4%
n 2432205
 
4.3%
t 1939529
 
3.5%
Other values (106) 24455587
43.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37415604
66.6%
Decimal Number 6238420
 
11.1%
Space Separator 4841572
 
8.6%
Uppercase Letter 4124724
 
7.3%
Other Punctuation 2108041
 
3.8%
Close Punctuation 713780
 
1.3%
Open Punctuation 713780
 
1.3%
Dash Punctuation 25881
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4939373
13.2%
i 3725884
10.0%
e 3410133
 
9.1%
r 2844813
 
7.6%
s 2669041
 
7.1%
o 2472444
 
6.6%
l 2451221
 
6.6%
n 2432205
 
6.5%
t 1939529
 
5.2%
u 1855689
 
5.0%
Other values (50) 8675272
23.2%
Uppercase Letter
ValueCountFrequency (%)
S 397556
 
9.6%
C 388252
 
9.4%
P 377283
 
9.1%
L 347759
 
8.4%
M 289125
 
7.0%
A 288263
 
7.0%
B 241357
 
5.9%
H 229456
 
5.6%
G 212079
 
5.1%
D 169189
 
4.1%
Other values (27) 1184405
28.7%
Decimal Number
ValueCountFrequency (%)
1 1867300
29.9%
8 1302965
20.9%
9 698227
 
11.2%
7 528982
 
8.5%
5 364870
 
5.8%
6 311333
 
5.0%
2 311221
 
5.0%
0 297843
 
4.8%
4 286353
 
4.6%
3 269326
 
4.3%
Other Punctuation
ValueCountFrequency (%)
, 1573893
74.7%
. 387974
 
18.4%
& 136412
 
6.5%
' 9761
 
0.5%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
4841572
100.0%
Close Punctuation
ValueCountFrequency (%)
) 713780
100.0%
Open Punctuation
ValueCountFrequency (%)
( 713780
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 25881
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41540328
73.9%
Common 14641474
 
26.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4939373
 
11.9%
i 3725884
 
9.0%
e 3410133
 
8.2%
r 2844813
 
6.8%
s 2669041
 
6.4%
o 2472444
 
6.0%
l 2451221
 
5.9%
n 2432205
 
5.9%
t 1939529
 
4.7%
u 1855689
 
4.5%
Other values (87) 12799996
30.8%
Common
ValueCountFrequency (%)
4841572
33.1%
1 1867300
 
12.8%
, 1573893
 
10.7%
8 1302965
 
8.9%
) 713780
 
4.9%
( 713780
 
4.9%
9 698227
 
4.8%
7 528982
 
3.6%
. 387974
 
2.6%
5 364870
 
2.5%
Other values (9) 1648131
 
11.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56056544
99.8%
None 125258
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4939373
 
8.8%
4841572
 
8.6%
i 3725884
 
6.6%
e 3410133
 
6.1%
r 2844813
 
5.1%
s 2669041
 
4.8%
o 2472444
 
4.4%
l 2451221
 
4.4%
n 2432205
 
4.3%
t 1939529
 
3.5%
Other values (61) 24330329
43.4%
None
ValueCountFrequency (%)
ü 33290
26.6%
ö 25666
20.5%
è 21744
17.4%
é 20812
16.6%
ø 8502
 
6.8%
å 4702
 
3.8%
Ö 4493
 
3.6%
á 1535
 
1.2%
ä 1285
 
1.0%
í 836
 
0.7%
Other values (35) 2393
 
1.9%

acceptedNameUsage
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB
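
The percentages in this block are consistent with simple ratios over the 1926393 observations reported in the overview. The sketch below shows that arithmetic under an assumption inferred from the figures in this report rather than from the profiler's documentation: Missing (%) is taken relative to all rows, while Distinct (%) and Unique (%) are taken relative to the non-missing rows. The function name `summary_percentages` is hypothetical.

```python
N_ROWS = 1926393  # "Number of observations" from the report overview


def summary_percentages(n_rows, n_missing, n_distinct, n_unique):
    """Return missing, distinct and unique percentages for one variable.

    Assumption: distinct and unique are divided by the non-missing count.
    """
    n_present = n_rows - n_missing
    return {
        "missing_pct": 100 * n_missing / n_rows,
        "distinct_pct": 100 * n_distinct / n_present,
        "unique_pct": 100 * n_unique / n_present,
    }


# acceptedNameUsage: 1926391 missing, 1 distinct value, 0 values occurring once.
result = summary_percentages(N_ROWS, n_missing=1926391, n_distinct=1, n_unique=0)
print(result)
```

With two non-missing rows sharing one value, the distinct share is 1 / 2 = 50.0% and the unique share is 0.0%, as reported for acceptedNameUsage; the missing share rounds to the "> 99.9%" shown.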

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters14
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSPECIES
2nd rowSPECIES
Value    Count  Frequency (%)
species  2      100.0%

Most occurring characters

ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 14
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 14
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 4
28.6%
E 4
28.6%
P 2
14.3%
C 2
14.3%
I 2
14.3%

parentNameUsage
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing1926392
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length9
Median length9
Mean length9
Min length9

Characters and Unicode

Total characters9
Distinct characters9
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)100.0%

Sample

1st rowGEOLocate
Value      Count  Frequency (%)
geolocate  1      100.0%

Most occurring characters

ValueCountFrequency (%)
G 1
11.1%
E 1
11.1%
O 1
11.1%
L 1
11.1%
o 1
11.1%
c 1
11.1%
a 1
11.1%
t 1
11.1%
e 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
55.6%
Uppercase Letter 4
44.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1
20.0%
c 1
20.0%
a 1
20.0%
t 1
20.0%
e 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
G 1
25.0%
E 1
25.0%
O 1
25.0%
L 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 1
11.1%
E 1
11.1%
O 1
11.1%
L 1
11.1%
o 1
11.1%
c 1
11.1%
a 1
11.1%
t 1
11.1%
e 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 1
11.1%
E 1
11.1%
O 1
11.1%
L 1
11.1%
o 1
11.1%
c 1
11.1%
a 1
11.1%
t 1
11.1%
e 1
11.1%

namePublishedIn
Text

Constant  Missing 

Distinct1
Distinct (%)50.0%
Missing1926391
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters16
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowACCEPTED
2nd rowACCEPTED
Value     Count  Frequency (%)
accepted  2      100.0%

Most occurring characters

ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 16
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 16
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 4
25.0%
E 4
25.0%
A 2
12.5%
P 2
12.5%
T 2
12.5%
D 2
12.5%
Distinct4354
Distinct (%)0.2%
Missing469
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length134
Median length117
Mean length62.96739176
Min length7

Characters and Unicode

Total characters121270411
Distinct characters60
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique586 ?
Unique (%)< 0.1%

Sample

1st rowAnimalia, Porifera, Calcarea
2nd rowAnimalia, Mollusca, Gastropoda, Bullidae
3rd rowAnimalia, Cnidaria, Anthozoa, Hexacorallia, Antipatharia, Stylopathidae
4th rowAnimalia, Echinodermata, Ophiuroidea, Ophiurida, Ophiotrichidae
5th rowAnimalia, Mollusca, Gastropoda, Cypraeidae
ValueCountFrequency (%)
animalia 1922044
 
18.1%
mollusca 866407
 
8.1%
gastropoda 612759
 
5.8%
arthropoda 390750
 
3.7%
crustacea 385110
 
3.6%
malacostraca 301975
 
2.8%
eumalacostraca 294895
 
2.8%
annelida 241801
 
2.3%
polychaeta 212969
 
2.0%
bivalvia 207685
 
2.0%
Other values (4342) 5202802
48.9%

Most occurring characters

ValueCountFrequency (%)
a 19360559
16.0%
i 10629446
 
8.8%
8713273
 
7.2%
, 8691731
 
7.2%
o 7923783
 
6.5%
l 7526240
 
6.2%
e 6162876
 
5.1%
d 5675251
 
4.7%
r 5612652
 
4.6%
c 5023755
 
4.1%
Other values (50) 35950845
29.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 93247129
76.9%
Uppercase Letter 10617486
 
8.8%
Space Separator 8713273
 
7.2%
Other Punctuation 8691773
 
7.2%
Dash Punctuation 283
 
< 0.1%
Open Punctuation 169
 
< 0.1%
Close Punctuation 169
 
< 0.1%
Connector Punctuation 126
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 19360559
20.8%
i 10629446
11.4%
o 7923783
8.5%
l 7526240
 
8.1%
e 6162876
 
6.6%
d 5675251
 
6.1%
r 5612652
 
6.0%
c 5023755
 
5.4%
n 4723925
 
5.1%
t 4393622
 
4.7%
Other values (16) 16215020
17.4%
Uppercase Letter
ValueCountFrequency (%)
A 2993701
28.2%
M 1365751
12.9%
C 1145094
 
10.8%
P 1046030
 
9.9%
E 846110
 
8.0%
G 714699
 
6.7%
S 488625
 
4.6%
D 335052
 
3.2%
B 296981
 
2.8%
T 261608
 
2.5%
Other values (15) 1123835
 
10.6%
Other Punctuation
ValueCountFrequency (%)
, 8691731
> 99.9%
. 28
 
< 0.1%
? 14
 
< 0.1%
Space Separator
ValueCountFrequency (%)
8713273
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 283
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 169
100.0%
Close Punctuation
ValueCountFrequency (%)
] 169
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 126
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 103864615
85.6%
Common 17405796
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 19360559
18.6%
i 10629446
 
10.2%
o 7923783
 
7.6%
l 7526240
 
7.2%
e 6162876
 
5.9%
d 5675251
 
5.5%
r 5612652
 
5.4%
c 5023755
 
4.8%
n 4723925
 
4.5%
t 4393622
 
4.2%
Other values (41) 26832506
25.8%
Common
ValueCountFrequency (%)
8713273
50.1%
, 8691731
49.9%
- 283
 
< 0.1%
[ 169
 
< 0.1%
] 169
 
< 0.1%
_ 126
 
< 0.1%
. 28
 
< 0.1%
? 14
 
< 0.1%
+ 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 121270411
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 19360559
16.0%
i 10629446
 
8.8%
8713273
 
7.2%
, 8691731
 
7.2%
o 7923783
 
6.5%
l 7526240
 
6.2%
e 6162876
 
5.1%
d 5675251
 
4.7%
r 5612652
 
4.6%
c 5023755
 
4.1%
Other values (50) 35950845
29.6%
Distinct6
Distinct (%)< 0.1%
Missing4
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length36
Median length8
Mean length8.007927786
Min length8

Characters and Unicode

Total characters15426384
Distinct characters31
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
3rd rowAnimalia
4th rowAnimalia
5th rowAnimalia
Value                                 Count    Frequency (%)
animalia                              1920497  99.6%
chromista                             2826     0.1%
incertae                              2065     0.1%
sedis                                 2065     0.1%
protozoa                              964      < 0.1%
bacteria                              35       < 0.1%
821cc27a-e3bb-4bc5-ac34-89ada245069d  2        < 0.1%

Most occurring characters

ValueCountFrequency (%)
i 3847985
24.9%
a 3846927
24.9%
m 1923323
12.5%
n 1922562
12.5%
A 1920497
12.4%
l 1920497
12.4%
s 6956
 
< 0.1%
e 6232
 
< 0.1%
r 5890
 
< 0.1%
t 5890
 
< 0.1%
Other values (21) 19625
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13499953
87.5%
Uppercase Letter 1924322
 
12.5%
Space Separator 2065
 
< 0.1%
Decimal Number 36
 
< 0.1%
Dash Punctuation 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 3847985
28.5%
a 3846927
28.5%
m 1923323
14.2%
n 1922562
14.2%
l 1920497
14.2%
s 6956
 
0.1%
e 6232
 
< 0.1%
r 5890
 
< 0.1%
t 5890
 
< 0.1%
o 5718
 
< 0.1%
Other values (5) 7973
 
0.1%
Decimal Number
ValueCountFrequency (%)
2 6
16.7%
4 6
16.7%
8 4
11.1%
3 4
11.1%
5 4
11.1%
9 4
11.1%
1 2
 
5.6%
7 2
 
5.6%
0 2
 
5.6%
6 2
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
A 1920497
99.8%
C 2826
 
0.1%
P 964
 
0.1%
B 35
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2065
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15424275
> 99.9%
Common 2109
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 3847985
24.9%
a 3846927
24.9%
m 1923323
12.5%
n 1922562
12.5%
A 1920497
12.5%
l 1920497
12.5%
s 6956
 
< 0.1%
e 6232
 
< 0.1%
r 5890
 
< 0.1%
t 5890
 
< 0.1%
Other values (9) 17516
 
0.1%
Common
ValueCountFrequency (%)
2065
97.9%
- 8
 
0.4%
2 6
 
0.3%
4 6
 
0.3%
8 4
 
0.2%
3 4
 
0.2%
5 4
 
0.2%
9 4
 
0.2%
1 2
 
0.1%
7 2
 
0.1%
Other values (2) 4
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15426384
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 3847985
24.9%
a 3846927
24.9%
m 1923323
12.5%
n 1922562
12.5%
A 1920497
12.4%
l 1920497
12.4%
s 6956
 
< 0.1%
e 6232
 
< 0.1%
r 5890
 
< 0.1%
t 5890
 
< 0.1%
Other values (21) 19625
 
0.1%

phylum
Text

Distinct52
Distinct (%)< 0.1%
Missing3160
Missing (%)0.2%
Memory size14.7 MiB

Length

Max length17
Median length8
Mean length8.850655641
Min length2

Characters and Unicode

Total characters17021873
Distinct characters40
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6 ?
Unique (%)< 0.1%

Sample

1st rowPorifera
2nd rowMollusca
3rd rowCnidaria
4th rowEchinodermata
5th rowMollusca
ValueCountFrequency (%)
mollusca 864192
44.9%
arthropoda 392999
20.4%
annelida 241615
 
12.6%
cnidaria 117703
 
6.1%
echinodermata 91212
 
4.7%
nematoda 68758
 
3.6%
platyhelminthes 45840
 
2.4%
porifera 32733
 
1.7%
chordata 19745
 
1.0%
sipuncula 10415
 
0.5%
Other values (42) 38021
 
2.0%

Most occurring characters

ValueCountFrequency (%)
a 2238532
13.2%
l 2078076
12.2%
o 1907515
11.2%
r 1110893
 
6.5%
c 984617
 
5.8%
d 936895
 
5.5%
s 910659
 
5.3%
u 885746
 
5.2%
M 866329
 
5.1%
n 769092
 
4.5%
Other values (30) 4333519
25.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15098638
88.7%
Uppercase Letter 1923235
 
11.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2238532
14.8%
l 2078076
13.8%
o 1907515
12.6%
r 1110893
7.4%
c 984617
 
6.5%
d 936895
 
6.2%
s 910659
 
6.0%
u 885746
 
5.9%
n 769092
 
5.1%
t 683305
 
4.5%
Other values (11) 2593308
17.2%
Uppercase Letter
ValueCountFrequency (%)
M 866329
45.0%
A 639230
33.2%
C 140451
 
7.3%
E 91483
 
4.8%
P 79547
 
4.1%
N 75120
 
3.9%
S 10431
 
0.5%
B 9994
 
0.5%
K 6389
 
0.3%
H 2144
 
0.1%
Other values (9) 2117
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 17021873
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2238532
13.2%
l 2078076
12.2%
o 1907515
11.2%
r 1110893
 
6.5%
c 984617
 
5.8%
d 936895
 
5.5%
s 910659
 
5.3%
u 885746
 
5.2%
M 866329
 
5.1%
n 769092
 
4.5%
Other values (30) 4333519
25.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17021873
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2238532
13.2%
l 2078076
12.2%
o 1907515
11.2%
r 1110893
 
6.5%
c 984617
 
5.8%
d 936895
 
5.5%
s 910659
 
5.3%
u 885746
 
5.2%
M 866329
 
5.1%
n 769092
 
4.5%
Other values (30) 4333519
25.5%

class
Text

Missing 

Distinct116
Distinct (%)< 0.1%
Missing66157
Missing (%)3.4%
Memory size14.7 MiB

Length

Max length24
Median length19
Mean length10.05340075
Min length4

Characters and Unicode

Total characters18701698
Distinct characters54
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11 ?
Unique (%)< 0.1%

Sample

1st rowCalcarea
2nd rowGastropoda
3rd rowAnthozoa
4th rowOphiuroidea
5th rowGastropoda
ValueCountFrequency (%)
gastropoda 610123
32.8%
malacostraca 301912
16.2%
polychaeta 211086
 
11.3%
bivalvia 207854
 
11.2%
anthozoa 93050
 
5.0%
copepoda 46190
 
2.5%
chromadorea 42750
 
2.3%
clitellata 30336
 
1.6%
ophiuroidea 27087
 
1.5%
asteroidea 25635
 
1.4%
Other values (106) 264213
14.2%

Most occurring characters

ValueCountFrequency (%)
a 4042336
21.6%
o 2534615
13.6%
t 1401870
 
7.5%
r 1169735
 
6.3%
s 1022238
 
5.5%
d 956343
 
5.1%
c 944030
 
5.0%
l 924665
 
4.9%
p 848962
 
4.5%
i 703184
 
3.8%
Other values (44) 4153720
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16841416
90.1%
Uppercase Letter 1860238
 
9.9%
Decimal Number 34
 
< 0.1%
Other Punctuation 6
 
< 0.1%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4042336
24.0%
o 2534615
15.0%
t 1401870
 
8.3%
r 1169735
 
6.9%
s 1022238
 
6.1%
d 956343
 
5.7%
c 944030
 
5.6%
l 924665
 
5.5%
p 848962
 
5.0%
i 703184
 
4.2%
Other values (14) 2293438
13.6%
Uppercase Letter
ValueCountFrequency (%)
G 617920
33.2%
M 317072
17.0%
P 232221
 
12.5%
B 211394
 
11.4%
C 168535
 
9.1%
A 139771
 
7.5%
O 50239
 
2.7%
H 37453
 
2.0%
T 25601
 
1.4%
E 22993
 
1.2%
Other values (9) 37039
 
2.0%
Decimal Number
ValueCountFrequency (%)
2 10
29.4%
1 7
20.6%
0 5
14.7%
4 3
 
8.8%
3 3
 
8.8%
5 3
 
8.8%
8 2
 
5.9%
9 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
: 4
66.7%
. 2
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18701654
> 99.9%
Common 44
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4042336
21.6%
o 2534615
13.6%
t 1401870
 
7.5%
r 1169735
 
6.3%
s 1022238
 
5.5%
d 956343
 
5.1%
c 944030
 
5.0%
l 924665
 
4.9%
p 848962
 
4.5%
i 703184
 
3.8%
Other values (33) 4153676
22.2%
Common
ValueCountFrequency (%)
2 10
22.7%
1 7
15.9%
0 5
11.4%
- 4
 
9.1%
: 4
 
9.1%
4 3
 
6.8%
3 3
 
6.8%
5 3
 
6.8%
. 2
 
4.5%
8 2
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18701698
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4042336
21.6%
o 2534615
13.6%
t 1401870
 
7.5%
r 1169735
 
6.3%
s 1022238
 
5.5%
d 956343
 
5.1%
c 944030
 
5.0%
l 924665
 
4.9%
p 848962
 
4.5%
i 703184
 
3.8%
Other values (44) 4153720
22.2%

order
Text

Missing 

Distinct: 414
Distinct (%): < 0.1%
Missing: 329537
Missing (%): 17.1%
Memory size: 14.7 MiB

Length

Max length: 22
Median length: 20
Mean length: 11.19175304
Min length: 5

Characters and Unicode

Total characters: 17871618
Distinct characters: 46
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24
Unique (%): < 0.1%

Sample

1st row: Leucosolenida
2nd row: Cephalaspidea
3rd row: Antipatharia
4th row: Amphilepidida
5th row: Littorinimorpha

Value               Count    Frequency (%)
decapoda            196384   12.3%
neogastropoda       156428   9.8%
stylommatophora     116401   7.3%
littorinimorpha     113553   7.1%
phyllodocida        69439    4.3%
scleractinia        54200    3.4%
amphipoda           49533    3.1%
rhabditida          35176    2.2%
venerida            31275    2.0%
cardiida            30439    1.9%
Other values (404)  744028   46.6%

Most occurring characters

ValueCountFrequency (%)
a 2716551
15.2%
o 2130021
11.9%
i 1739041
 
9.7%
d 1413506
 
7.9%
t 1052242
 
5.9%
p 961716
 
5.4%
r 907952
 
5.1%
e 872746
 
4.9%
c 825635
 
4.6%
l 796133
 
4.5%
Other values (36) 4456075
24.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16274762
91.1%
Uppercase Letter 1596856
 
8.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2716551
16.7%
o 2130021
13.1%
i 1739041
10.7%
d 1413506
8.7%
t 1052242
 
6.5%
p 961716
 
5.9%
r 907952
 
5.6%
e 872746
 
5.4%
c 825635
 
5.1%
l 796133
 
4.9%
Other values (14) 2859219
17.6%
Uppercase Letter
ValueCountFrequency (%)
S 253706
15.9%
D 219372
13.7%
N 170936
10.7%
C 159863
10.0%
P 151094
9.5%
L 149326
9.4%
A 130604
8.2%
E 54026
 
3.4%
M 49031
 
3.1%
T 43142
 
2.7%
Other values (12) 215756
13.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 17871618
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2716551
15.2%
o 2130021
11.9%
i 1739041
 
9.7%
d 1413506
 
7.9%
t 1052242
 
5.9%
p 961716
 
5.4%
r 907952
 
5.1%
e 872746
 
4.9%
c 825635
 
4.6%
l 796133
 
4.5%
Other values (36) 4456075
24.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17871618
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2716551
15.2%
o 2130021
11.9%
i 1739041
 
9.7%
d 1413506
 
7.9%
t 1052242
 
5.9%
p 961716
 
5.4%
r 907952
 
5.1%
e 872746
 
4.9%
c 825635
 
4.6%
l 796133
 
4.5%
Other values (36) 4456075
24.9%

family
Text

Missing 

Distinct: 3522
Distinct (%): 0.2%
Missing: 144488
Missing (%): 7.5%
Memory size: 14.7 MiB

Length

Max length: 23
Median length: 21
Mean length: 11.20729837
Min length: 6

Characters and Unicode

Total characters: 19970341
Distinct characters: 52
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 272
Unique (%): < 0.1%

Sample

1st row: Syconidae
2nd row: Bullidae
3rd row: Stylopathidae
4th row: Ophiotrichidae
5th row: Cypraeidae

Value                Count     Frequency (%)
cambaridae           28956     1.6%
conidae              28425     1.6%
unionidae            26787     1.5%
muricidae            22783     1.3%
veneridae            18640     1.0%
cypraeidae           16831     0.9%
cerithiidae          16777     0.9%
spionidae            15856     0.9%
syllidae             14115     0.8%
pectinidae           12961     0.7%
Other values (3512)  1579774   88.7%

Most occurring characters

ValueCountFrequency (%)
i 2971768
14.9%
a 2739603
13.7%
e 2656666
13.3%
d 2019505
10.1%
o 1034346
 
5.2%
l 1016313
 
5.1%
r 1015168
 
5.1%
n 842574
 
4.2%
t 674978
 
3.4%
c 545975
 
2.7%
Other values (42) 4453445
22.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18188436
91.1%
Uppercase Letter 1781905
 
8.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2971768
16.3%
a 2739603
15.1%
e 2656666
14.6%
d 2019505
11.1%
o 1034346
 
5.7%
l 1016313
 
5.6%
r 1015168
 
5.6%
n 842574
 
4.6%
t 674978
 
3.7%
c 545975
 
3.0%
Other values (16) 2671540
14.7%
Uppercase Letter
ValueCountFrequency (%)
C 300023
16.8%
P 268312
15.1%
A 153050
8.6%
S 152792
8.6%
M 116985
 
6.6%
T 108088
 
6.1%
L 87632
 
4.9%
O 80760
 
4.5%
E 66802
 
3.7%
N 66686
 
3.7%
Other values (16) 380775
21.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 19970341
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2971768
14.9%
a 2739603
13.7%
e 2656666
13.3%
d 2019505
10.1%
o 1034346
 
5.2%
l 1016313
 
5.1%
r 1015168
 
5.1%
n 842574
 
4.2%
t 674978
 
3.4%
c 545975
 
2.7%
Other values (42) 4453445
22.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19970341
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2971768
14.9%
a 2739603
13.7%
e 2656666
13.3%
d 2019505
10.1%
o 1034346
 
5.2%
l 1016313
 
5.1%
r 1015168
 
5.1%
n 842574
 
4.2%
t 674978
 
3.4%
c 545975
 
2.7%
Other values (42) 4453445
22.3%

subtribe
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 130
Median length: 89
Mean length: 89
Min length: 48

Characters and Unicode

Total characters: 178
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_DERIVED_FROM_COORDINATES;CONTINENT_INVALID
2nd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT

Value  Count  Frequency (%)
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid  1  50.0%
occurrence_status_inferred_from_individual_count  1  50.0%

Most occurring characters

ValueCountFrequency (%)
_ 17
9.6%
E 16
 
9.0%
N 16
 
9.0%
I 15
 
8.4%
T 13
 
7.3%
D 13
 
7.3%
R 13
 
7.3%
C 12
 
6.7%
O 12
 
6.7%
U 10
 
5.6%
Other values (11) 41
23.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 156
87.6%
Connector Punctuation 17
 
9.6%
Other Punctuation 3
 
1.7%
Decimal Number 2
 
1.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 16
10.3%
N 16
10.3%
I 15
9.6%
T 13
8.3%
D 13
8.3%
R 13
8.3%
C 12
7.7%
O 12
7.7%
U 10
 
6.4%
A 8
 
5.1%
Other values (7) 28
17.9%
Decimal Number
ValueCountFrequency (%)
8 1
50.0%
4 1
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 17
100.0%
Other Punctuation
ValueCountFrequency (%)
; 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 156
87.6%
Common 22
 
12.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 16
10.3%
N 16
10.3%
I 15
9.6%
T 13
8.3%
D 13
8.3%
R 13
8.3%
C 12
7.7%
O 12
7.7%
U 10
 
6.4%
A 8
 
5.1%
Other values (7) 28
17.9%
Common
ValueCountFrequency (%)
_ 17
77.3%
; 3
 
13.6%
8 1
 
4.5%
4 1
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 178
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 17
9.6%
E 16
 
9.0%
N 16
 
9.0%
I 15
 
8.4%
T 13
 
7.3%
D 13
 
7.3%
R 13
 
7.3%
C 12
 
6.7%
O 12
 
6.7%
U 10
 
5.6%
Other values (11) 41
23.0%

genus
Text

Missing 

Distinct: 20787
Distinct (%): 1.3%
Missing: 358044
Missing (%): 18.6%
Memory size: 14.7 MiB

Length

Max length: 27
Median length: 23
Mean length: 9.482777111
Min length: 2

Characters and Unicode

Total characters: 14872304
Distinct characters: 52
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3152
Unique (%): 0.2%

Sample

1st row: Sycon
2nd row: Bulla
3rd row: Stylopathes
4th row: Ophiothrix
5th row: Naria

Value                 Count     Frequency (%)
conus                 22884     1.5%
cerithium             8956      0.6%
cambarus              8948      0.6%
faxonius              8189      0.5%
procambarus           8096      0.5%
aricidea              5223      0.3%
nerita                4536      0.3%
nassarius             4534      0.3%
pagurus               4234      0.3%
elimia                4085      0.3%
Other values (20777)  1488664   94.9%

Most occurring characters

ValueCountFrequency (%)
a 1794417
 
12.1%
i 1296042
 
8.7%
o 1190619
 
8.0%
e 1030226
 
6.9%
r 967232
 
6.5%
l 958555
 
6.4%
s 949415
 
6.4%
n 726098
 
4.9%
t 714218
 
4.8%
u 705352
 
4.7%
Other values (42) 4540130
30.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13303955
89.5%
Uppercase Letter 1568349
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1794417
13.5%
i 1296042
9.7%
o 1190619
8.9%
e 1030226
 
7.7%
r 967232
 
7.3%
l 958555
 
7.2%
s 949415
 
7.1%
n 726098
 
5.5%
t 714218
 
5.4%
u 705352
 
5.3%
Other values (16) 2971781
22.3%
Uppercase Letter
ValueCountFrequency (%)
P 224909
14.3%
C 211243
13.5%
A 159037
10.1%
S 120812
 
7.7%
M 103354
 
6.6%
L 96699
 
6.2%
E 88844
 
5.7%
T 88317
 
5.6%
O 65295
 
4.2%
H 63671
 
4.1%
Other values (16) 346168
22.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 14872304
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1794417
 
12.1%
i 1296042
 
8.7%
o 1190619
 
8.0%
e 1030226
 
6.9%
r 967232
 
6.5%
l 958555
 
6.4%
s 949415
 
6.4%
n 726098
 
4.9%
t 714218
 
4.8%
u 705352
 
4.7%
Other values (42) 4540130
30.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14872304
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1794417
 
12.1%
i 1296042
 
8.7%
o 1190619
 
8.0%
e 1030226
 
6.9%
r 967232
 
6.5%
l 958555
 
6.4%
s 949415
 
6.4%
n 726098
 
4.9%
t 714218
 
4.8%
u 705352
 
4.7%
Other values (42) 4540130
30.5%

genericName
Text

Missing 

Distinct: 21084
Distinct (%): 1.3%
Missing: 358043
Missing (%): 18.6%
Memory size: 14.7 MiB

Length

Max length: 27
Median length: 23
Mean length: 9.309154844
Min length: 1

Characters and Unicode

Total characters: 14600013
Distinct characters: 54
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3830
Unique (%): 0.2%

Sample

1st row: Scypha
2nd row: Bulla
3rd row: Stylopathes
4th row: Ophiothrix
5th row: Cypraea

Value                 Count     Frequency (%)
conus                 24156     1.5%
cypraea               15390     1.0%
cambarus              10146     0.6%
cerithium             9393      0.6%
orconectes            8661      0.6%
procambarus           8047      0.5%
nassarius             6727      0.4%
lumbrineris           4967      0.3%
terebra               4662      0.3%
aricidea              4572      0.3%
Other values (21074)  1471629   93.8%

Most occurring characters

ValueCountFrequency (%)
a 1744079
 
11.9%
i 1263792
 
8.7%
o 1156021
 
7.9%
e 1016938
 
7.0%
r 967987
 
6.6%
s 938349
 
6.4%
l 915577
 
6.3%
t 706525
 
4.8%
n 704068
 
4.8%
u 686498
 
4.7%
Other values (44) 4500179
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13031665
89.3%
Uppercase Letter 1568347
 
10.7%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1744079
13.4%
i 1263792
9.7%
o 1156021
 
8.9%
e 1016938
 
7.8%
r 967987
 
7.4%
s 938349
 
7.2%
l 915577
 
7.0%
t 706525
 
5.4%
n 704068
 
5.4%
u 686498
 
5.3%
Other values (17) 2931831
22.5%
Uppercase Letter
ValueCountFrequency (%)
C 229066
14.6%
P 219840
14.0%
A 154961
9.9%
S 126006
 
8.0%
M 103229
 
6.6%
T 96207
 
6.1%
L 90921
 
5.8%
E 82417
 
5.3%
O 74791
 
4.8%
H 62524
 
4.0%
Other values (16) 328385
20.9%
Other Punctuation
ValueCountFrequency (%)
? 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14600012
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1744079
 
11.9%
i 1263792
 
8.7%
o 1156021
 
7.9%
e 1016938
 
7.0%
r 967987
 
6.6%
s 938349
 
6.4%
l 915577
 
6.3%
t 706525
 
4.8%
n 704068
 
4.8%
u 686498
 
4.7%
Other values (43) 4500178
30.8%
Common
ValueCountFrequency (%)
? 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14600012
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1744079
 
11.9%
i 1263792
 
8.7%
o 1156021
 
7.9%
e 1016938
 
7.0%
r 967987
 
6.6%
s 938349
 
6.4%
l 915577
 
6.3%
t 706525
 
4.8%
n 704068
 
4.8%
u 686498
 
4.7%
Other values (43) 4500178
30.8%
None
ValueCountFrequency (%)
ö 1
100.0%

subgenus
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 10
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false

Value  Count  Frequency (%)
false  2      100.0%

Most occurring characters

ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 2
20.0%
a 2
20.0%
l 2
20.0%
s 2
20.0%
e 2
20.0%

infragenericEpithet
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 14
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 6482728
2nd row: 2504455

Value    Count  Frequency (%)
6482728  1      50.0%
2504455  1      50.0%

Most occurring characters

ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

specificEpithet
Text

Missing 

Distinct: 39412
Distinct (%): 3.0%
Missing: 626798
Missing (%): 32.5%
Memory size: 14.7 MiB

Length

Max length: 22
Median length: 19
Mean length: 8.507768189
Min length: 2

Characters and Unicode

Total characters: 11056653
Distinct characters: 38
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9920
Unique (%): 0.8%

Sample

1st row: striata
2nd row: columnaris
3rd row: suensonii
4th row: labrolineata
5th row: heteractis

Value                 Count     Frequency (%)
gracilis              6098      0.5%
fragilis              3477      0.3%
affinis               3341      0.3%
elegans               3182      0.2%
aculeata              3066      0.2%
borealis              2967      0.2%
americanus            2637      0.2%
grandis               2519      0.2%
acutus                2312      0.2%
tenuis                2265      0.2%
Other values (39402)  1267731   97.5%

Most occurring characters

ValueCountFrequency (%)
a 1553197
14.0%
i 1250540
11.3%
s 956883
 
8.7%
e 779958
 
7.1%
r 771552
 
7.0%
t 706671
 
6.4%
u 704699
 
6.4%
n 690520
 
6.2%
l 660182
 
6.0%
c 552656
 
5.0%
Other values (28) 2429795
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11056627
> 99.9%
Decimal Number 14
 
< 0.1%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1553197
14.0%
i 1250540
11.3%
s 956883
 
8.7%
e 779958
 
7.1%
r 771552
 
7.0%
t 706671
 
6.4%
u 704699
 
6.4%
n 690520
 
6.2%
l 660182
 
6.0%
c 552656
 
5.0%
Other values (20) 2429769
22.0%
Decimal Number
ValueCountFrequency (%)
2 3
21.4%
5 3
21.4%
4 3
21.4%
8 2
14.3%
0 1
 
7.1%
6 1
 
7.1%
7 1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11056627
> 99.9%
Common 26
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1553197
14.0%
i 1250540
11.3%
s 956883
 
8.7%
e 779958
 
7.1%
r 771552
 
7.0%
t 706671
 
6.4%
u 704699
 
6.4%
n 690520
 
6.2%
l 660182
 
6.0%
c 552656
 
5.0%
Other values (20) 2429769
22.0%
Common
ValueCountFrequency (%)
- 12
46.2%
2 3
 
11.5%
5 3
 
11.5%
4 3
 
11.5%
8 2
 
7.7%
0 1
 
3.8%
6 1
 
3.8%
7 1
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11056153
> 99.9%
None 500
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1553197
14.0%
i 1250540
11.3%
s 956883
 
8.7%
e 779958
 
7.1%
r 771552
 
7.0%
t 706671
 
6.4%
u 704699
 
6.4%
n 690520
 
6.2%
l 660182
 
6.0%
c 552656
 
5.0%
Other values (24) 2429295
22.0%
None
ValueCountFrequency (%)
ü 308
61.6%
ö 117
 
23.4%
ë 73
 
14.6%
ä 2
 
0.4%
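
The characters ü, ö, ë and ä listed under "None" fall outside the ASCII range; they belong to the Latin-1 Supplement range (U+0080 to U+00FF). A minimal sketch for tallying such characters in a list of values, using only the standard library, is shown here. `non_ascii_counts` is a hypothetical helper, and the sample epithets are made up for illustration rather than taken from the dataset.

```python
import unicodedata
from collections import Counter

def non_ascii_counts(values):
    """Count every character whose code point lies outside ASCII (> U+007F)."""
    counts = Counter()
    for value in values:
        for ch in value:
            if ord(ch) > 0x7F:
                counts[ch] += 1
    return counts

# Hypothetical epithets, for illustration only.
sample = ["muelleri", "mülleri", "köllikeri", "mülleri"]
for ch, n in non_ascii_counts(sample).most_common():
    # unicodedata.name gives the official character name for each hit.
    print(ch, n, unicodedata.name(ch))
```

Listing the affected values alongside the counts would make it easier to decide whether to keep the diacritics or normalise them to an ASCII spelling.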

infraspecificEpithet
Text

Missing 

Distinct: 3653
Distinct (%): 10.1%
Missing: 1890289
Missing (%): 98.1%
Memory size: 14.7 MiB

Length

Max length: 18
Median length: 16
Mean length: 8.605777753
Min length: 1

Characters and Unicode

Total characters: 310703
Distinct characters: 29
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1259
Unique (%): 3.5%

Sample

1st row: connectens
2nd row: laevis
3rd row: schizodontia
4th row: antarctica
5th row: sayi

Value                Count   Frequency (%)
acutus               1011    2.8%
radiata              616     1.7%
bartonii             521     1.4%
gibbosus             501     1.4%
appressa             443     1.2%
campanulatum         379     1.0%
longimanus           359     1.0%
carinata             350     1.0%
floridana            283     0.8%
trivolvis            273     0.8%
Other values (3643)  31368   86.9%

Most occurring characters

ValueCountFrequency (%)
a 45988
14.8%
i 33598
10.8%
s 29641
9.5%
e 22986
 
7.4%
n 22086
 
7.1%
u 19813
 
6.4%
r 19186
 
6.2%
t 17670
 
5.7%
l 16838
 
5.4%
c 16647
 
5.4%
Other values (19) 66250
21.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 310700
> 99.9%
Decimal Number 2
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 45988
14.8%
i 33598
10.8%
s 29641
9.5%
e 22986
 
7.4%
n 22086
 
7.1%
u 19813
 
6.4%
r 19186
 
6.2%
t 17670
 
5.7%
l 16838
 
5.4%
c 16647
 
5.4%
Other values (17) 66247
21.3%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 310700
> 99.9%
Common 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 45988
14.8%
i 33598
10.8%
s 29641
9.5%
e 22986
 
7.4%
n 22086
 
7.1%
u 19813
 
6.4%
r 19186
 
6.2%
t 17670
 
5.7%
l 16838
 
5.4%
c 16647
 
5.4%
Other values (17) 66247
21.3%
Common
ValueCountFrequency (%)
1 2
66.7%
- 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 310683
> 99.9%
None 20
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 45988
14.8%
i 33598
10.8%
s 29641
9.5%
e 22986
 
7.4%
n 22086
 
7.1%
u 19813
 
6.4%
r 19186
 
6.2%
t 17670
 
5.7%
l 16838
 
5.4%
c 16647
 
5.4%
Other values (18) 66230
21.3%
None
ValueCountFrequency (%)
ö 20
100.0%
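
The Distinct (%), Missing (%) and Mean length figures reported for infraspecificEpithet can be reproduced from the raw counts. The sketch below assumes that Missing (%) is taken over all 1926393 observations while Distinct (%) and Mean length are taken over the non-missing rows only; that reading matches the reported numbers, but it is an inference, not a statement about the report's code. `column_summary` is a hypothetical helper name.

```python
def column_summary(total_rows, missing, distinct, total_chars):
    """Derive the percentage and mean-length figures from raw counts."""
    present = total_rows - missing  # rows that actually hold a value
    return {
        "missing_pct": 100.0 * missing / total_rows,
        "distinct_pct": 100.0 * distinct / present,
        "mean_length": total_chars / present,
    }

# Counts reported for infraspecificEpithet; 1926393 is the number of observations.
stats = column_summary(
    total_rows=1926393, missing=1890289, distinct=3653, total_chars=310703
)
for name, value in stats.items():
    print(name, round(value, 1))
```

The same arithmetic explains why columns with only two populated rows show a Distinct (%) of 50.0% or 100.0%.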

cultivarEpithet
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 108
2nd row: 108

Value  Count  Frequency (%)
108    2      100.0%

Most occurring characters

ValueCountFrequency (%)
1 2
33.3%
0 2
33.3%
8 2
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
33.3%
0 2
33.3%
8 2
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
33.3%
0 2
33.3%
8 2
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
33.3%
0 2
33.3%
8 2
33.3%

taxonRank
Text

Distinct: 14
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 7
Mean length: 6.539588038
Min length: 3

Characters and Unicode

Total characters: 12597797
Distinct characters: 29
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: GENUS
2nd row: SPECIES
3rd row: SPECIES
4th row: SPECIES
5th row: SPECIES

Value             Count     Frequency (%)
species           1263491   65.6%
genus             268755    14.0%
family            216656    11.2%
class             63569     3.3%
phylum            48164     2.5%
subspecies        32829     1.7%
order             26813     1.4%
kingdom           2836      0.1%
variety           2500      0.1%
form              773       < 0.1%
Other values (4)  4         < 0.1%

Most occurring characters

ValueCountFrequency (%)
S 3021362
24.0%
E 2890709
22.9%
I 1518312
12.1%
C 1359889
10.8%
P 1344484
10.7%
U 349749
 
2.8%
L 328389
 
2.6%
A 282726
 
2.2%
N 271593
 
2.2%
G 271591
 
2.2%
Other values (19) 958993
 
7.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 12597784
> 99.9%
Decimal Number 13
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 3021362
24.0%
E 2890709
22.9%
I 1518312
12.1%
C 1359889
10.8%
P 1344484
10.7%
U 349749
 
2.8%
L 328389
 
2.6%
A 282726
 
2.2%
N 271593
 
2.2%
G 271591
 
2.2%
Other values (11) 958980
 
7.6%
Decimal Number
ValueCountFrequency (%)
4 4
30.8%
1 2
15.4%
5 2
15.4%
9 1
 
7.7%
6 1
 
7.7%
7 1
 
7.7%
8 1
 
7.7%
3 1
 
7.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 12597784
> 99.9%
Common 13
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 3021362
24.0%
E 2890709
22.9%
I 1518312
12.1%
C 1359889
10.8%
P 1344484
10.7%
U 349749
 
2.8%
L 328389
 
2.6%
A 282726
 
2.2%
N 271593
 
2.2%
G 271591
 
2.2%
Other values (11) 958980
 
7.6%
Common
ValueCountFrequency (%)
4 4
30.8%
1 2
15.4%
5 2
15.4%
9 1
 
7.7%
6 1
 
7.7%
7 1
 
7.7%
8 1
 
7.7%
3 1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12597797
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 3021362
24.0%
E 2890709
22.9%
I 1518312
12.1%
C 1359889
10.8%
P 1344484
10.7%
U 349749
 
2.8%
L 328389
 
2.6%
A 282726
 
2.2%
N 271593
 
2.2%
G 271591
 
2.2%
Other values (19) 958993
 
7.6%

verbatimTaxonRank
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 891
2nd row: 434

Value  Count  Frequency (%)
891    1      50.0%
434    1      50.0%

Most occurring characters

ValueCountFrequency (%)
4 2
33.3%
8 1
16.7%
9 1
16.7%
1 1
16.7%
3 1
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 2
33.3%
8 1
16.7%
9 1
16.7%
1 1
16.7%
3 1
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 2
33.3%
8 1
16.7%
9 1
16.7%
1 1
16.7%
3 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 2
33.3%
8 1
16.7%
9 1
16.7%
1 1
16.7%
3 1
16.7%

vernacularName
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926391
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 8
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 5954
2nd row: 6426

Value  Count  Frequency (%)
5954   1      50.0%
6426   1      50.0%

Most occurring characters

ValueCountFrequency (%)
5 2
25.0%
4 2
25.0%
6 2
25.0%
9 1
12.5%
2 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 2
25.0%
4 2
25.0%
6 2
25.0%
9 1
12.5%
2 1
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 2
25.0%
4 2
25.0%
6 2
25.0%
9 1
12.5%
2 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 2
25.0%
4 2
25.0%
6 2
25.0%
9 1
12.5%
2 1
12.5%

nomenclaturalCode
Text

Missing 

Distinct: 4
Distinct (%): 100.0%
Missing: 1926389
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 17
Median length: 15
Mean length: 11
Min length: 7

Characters and Unicode

Total characters: 44
Distinct characters: 27
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 100.0%

Sample

1st row: 6482725
2nd row: Van Cleave, H. J.
3rd row: Schwartz, Ben
4th row: 2504454

Value     Count  Frequency (%)
6482725   1      12.5%
van       1      12.5%
cleave    1      12.5%
h         1      12.5%
j         1      12.5%
schwartz  1      12.5%
ben       1      12.5%
2504454   1      12.5%

Most occurring characters

ValueCountFrequency (%)
4
 
9.1%
4 4
 
9.1%
2 3
 
6.8%
5 3
 
6.8%
a 3
 
6.8%
e 3
 
6.8%
. 2
 
4.5%
, 2
 
4.5%
n 2
 
4.5%
w 1
 
2.3%
Other values (17) 17
38.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
36.4%
Decimal Number 14
31.8%
Uppercase Letter 6
 
13.6%
Space Separator 4
 
9.1%
Other Punctuation 4
 
9.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3
18.8%
e 3
18.8%
n 2
12.5%
w 1
 
6.2%
h 1
 
6.2%
r 1
 
6.2%
t 1
 
6.2%
z 1
 
6.2%
c 1
 
6.2%
v 1
 
6.2%
Decimal Number
ValueCountFrequency (%)
4 4
28.6%
2 3
21.4%
5 3
21.4%
6 1
 
7.1%
7 1
 
7.1%
8 1
 
7.1%
0 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
S 1
16.7%
B 1
16.7%
J 1
16.7%
H 1
16.7%
C 1
16.7%
V 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 2
50.0%
, 2
50.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22
50.0%
Latin 22
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3
13.6%
e 3
13.6%
n 2
 
9.1%
w 1
 
4.5%
h 1
 
4.5%
r 1
 
4.5%
S 1
 
4.5%
t 1
 
4.5%
z 1
 
4.5%
B 1
 
4.5%
Other values (7) 7
31.8%
Common
ValueCountFrequency (%)
4
18.2%
4 4
18.2%
2 3
13.6%
5 3
13.6%
. 2
9.1%
, 2
9.1%
6 1
 
4.5%
7 1
 
4.5%
8 1
 
4.5%
0 1
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 44
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4
 
9.1%
4 4
 
9.1%
2 3
 
6.8%
5 3
 
6.8%
a 3
 
6.8%
e 3
 
6.8%
. 2
 
4.5%
, 2
 
4.5%
n 2
 
4.5%
w 1
 
2.3%
Other values (17) 17
38.6%
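
The value table for nomenclaturalCode earlier in this section lists eight lowercase tokens for four cell values, which suggests that values are lowercased, split on whitespace and stripped of surrounding punctuation before counting. The following sketch reproduces those counts under that assumption. `word_frequencies` is a hypothetical helper written for this example, not the report's own code.

```python
import string
from collections import Counter

def word_frequencies(values):
    """Return {token: (count, percent of all tokens)} for a list of strings."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Drop punctuation attached to the token, e.g. "cleave," -> "cleave".
            token = token.strip(string.punctuation + string.whitespace)
            if token:
                counts[token] += 1
    total = sum(counts.values())
    return {word: (n, 100.0 * n / total) for word, n in counts.items()}

# The four non-missing nomenclaturalCode values shown in the Sample block.
rows = ["6482725", "Van Cleave, H. J.", "Schwartz, Ben", "2504454"]
for word, (n, pct) in word_frequencies(rows).items():
    print(word, n, f"{pct:.1f}%")
```

Because the percentages are shares of all tokens rather than of rows, a column with multi-word values can show more table entries than it has distinct values.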

taxonomicStatus
Text

Distinct: 3
Distinct (%): < 0.1%
Missing: 2071
Missing (%): 0.1%
Memory size: 14.7 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.818195707
Min length: 7

Characters and Unicode

Total characters: 15044726
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowSYNONYM
2nd rowACCEPTED
3rd rowACCEPTED
4th rowACCEPTED
5th rowSYNONYM
ValueCountFrequency (%)
accepted 1560511
81.1%
synonym 349850
 
18.2%
doubtful 13961
 
0.7%

Most occurring characters

ValueCountFrequency (%)
C 3121022
20.7%
E 3121022
20.7%
T 1574472
10.5%
D 1574472
10.5%
A 1560511
10.4%
P 1560511
10.4%
Y 699700
 
4.7%
N 699700
 
4.7%
O 363811
 
2.4%
S 349850
 
2.3%
Other values (5) 419655
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 15044726
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 3121022
20.7%
E 3121022
20.7%
T 1574472
10.5%
D 1574472
10.5%
A 1560511
10.4%
P 1560511
10.4%
Y 699700
 
4.7%
N 699700
 
4.7%
O 363811
 
2.4%
S 349850
 
2.3%
Other values (5) 419655
 
2.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 15044726
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 3121022
20.7%
E 3121022
20.7%
T 1574472
10.5%
D 1574472
10.5%
A 1560511
10.4%
P 1560511
10.4%
Y 699700
 
4.7%
N 699700
 
4.7%
O 363811
 
2.4%
S 349850
 
2.3%
Other values (5) 419655
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15044726
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 3121022
20.7%
E 3121022
20.7%
T 1574472
10.5%
D 1574472
10.5%
A 1560511
10.4%
P 1560511
10.4%
Y 699700
 
4.7%
N 699700
 
4.7%
O 363811
 
2.4%
S 349850
 
2.3%
Other values (5) 419655
 
2.8%
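
The length and character totals reported for this three-valued variable can be cross-checked from its frequency table. The following is a minimal sketch, assuming the counts shown above and the overall observation count from the report overview; the variable names are illustrative only.

```python
# Value counts as listed in the frequency table for this variable.
value_counts = {"ACCEPTED": 1_560_511, "SYNONYM": 349_850, "DOUBTFUL": 13_961}

total_rows = 1_926_393  # "Number of observations" from the report overview
missing = 2_071         # "Missing" for this variable

# Non-missing rows should equal the sum of the per-value counts.
non_missing = sum(value_counts.values())

# Total characters = sum over values of (string length * number of rows).
total_characters = sum(len(value) * n for value, n in value_counts.items())

# Mean length is taken over non-missing rows only.
mean_length = total_characters / non_missing

print(non_missing, total_rows - missing)
print(total_characters)
print(round(mean_length, 6))
```

Under these assumptions the sums agree with the "Total characters" and "Mean length" figures listed for the variable.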

nomenclaturalStatus
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926391
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 7
Median length 7
Mean length 7
Min length 7

Characters and Unicode

Total characters 14
Distinct characters 7
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 6482728
2nd row 2504455
ValueCountFrequency (%)
6482728 1
50.0%
2504455 1
50.0%

Most occurring characters

ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 3
21.4%
2 3
21.4%
5 3
21.4%
8 2
14.3%
6 1
 
7.1%
7 1
 
7.1%
0 1
 
7.1%
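
The character and category tables for nomenclaturalStatus are small enough to reproduce by hand. The sketch below uses Python's standard `unicodedata` module on the two sample values shown above. It is an illustration of the idea, not the profiling tool's own implementation, and it reports the short category code `Nd` where the report prints "Decimal Number".

```python
import unicodedata
from collections import Counter

# The two non-missing values reported for nomenclaturalStatus.
values = ["6482728", "2504455"]
chars = "".join(values)

# Frequency of each character across all values.
char_counts = Counter(chars)

# Unicode general category of each character ("Nd" = Decimal Number).
category_counts = Counter(unicodedata.category(c) for c in chars)

# A simple stand-in for the "block" check: every code point below 128 is ASCII.
all_ascii = all(ord(c) < 128 for c in chars)

print(len(chars), len(char_counts))
print(char_counts.most_common())
print(dict(category_counts), all_ascii)
```

The standard library does not expose the Unicode script property, so the "scripts" table is not covered by this sketch.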

taxonRemarks
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 1926390
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 22
Median length 19
Mean length 16.33333333
Min length 8

Characters and Unicode

Total characters 49
Distinct characters 18
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row Hemionchos striatus
2nd row Nematoda
3rd row Conspicuum icteridorum
ValueCountFrequency (%)
hemionchos 1
20.0%
striatus 1
20.0%
nematoda 1
20.0%
conspicuum 1
20.0%
icteridorum 1
20.0%

Most occurring characters

ValueCountFrequency (%)
i 5
10.2%
o 5
10.2%
m 4
 
8.2%
s 4
 
8.2%
t 4
 
8.2%
u 4
 
8.2%
c 3
 
6.1%
e 3
 
6.1%
r 3
 
6.1%
a 3
 
6.1%
Other values (8) 11
22.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 44
89.8%
Uppercase Letter 3
 
6.1%
Space Separator 2
 
4.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 5
11.4%
o 5
11.4%
m 4
9.1%
s 4
9.1%
t 4
9.1%
u 4
9.1%
c 3
6.8%
e 3
6.8%
r 3
6.8%
a 3
6.8%
Other values (4) 6
13.6%
Uppercase Letter
ValueCountFrequency (%)
C 1
33.3%
H 1
33.3%
N 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 47
95.9%
Common 2
 
4.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 5
10.6%
o 5
10.6%
m 4
8.5%
s 4
8.5%
t 4
8.5%
u 4
8.5%
c 3
 
6.4%
e 3
 
6.4%
r 3
 
6.4%
a 3
 
6.4%
Other values (7) 9
19.1%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 5
10.2%
o 5
10.2%
m 4
 
8.2%
s 4
 
8.2%
t 4
 
8.2%
u 4
 
8.2%
c 3
 
6.1%
e 3
 
6.1%
r 3
 
6.1%
a 3
 
6.1%
Other values (8) 11
22.4%
Distinct 3
Distinct (%) < 0.1%
Missing 4
Missing (%) < 0.1%
Memory size 14.7 MiB

Length

Max length 46
Median length 36
Mean length 36.00000831
Min length 36

Characters and Unicode

Total characters 69350020
Distinct characters 37
Distinct categories 6
Distinct scripts 2
Distinct blocks 1

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row 821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row 821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row 821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row 821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row 821cc27a-e3bb-4bc5-ac34-89ada245069d
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 1926387
> 99.9%
2
 
< 0.1%
hemionchos 1
 
< 0.1%
striatus 1
 
< 0.1%
campbell 1
 
< 0.1%
beveridge 1
 
< 0.1%
2006 1
 
< 0.1%
conspicuum 1
 
< 0.1%
icteridorum 1
 
< 0.1%
denton 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
c 7705551
11.1%
a 7705550
11.1%
- 7705548
11.1%
2 5779162
8.3%
b 5779162
8.3%
4 5779161
8.3%
d 3852777
 
5.6%
9 3852775
 
5.6%
5 3852775
 
5.6%
8 3852774
 
5.6%
Other values (27) 13484785
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 34674974
50.0%
Lowercase Letter 26969478
38.9%
Dash Punctuation 7705548
 
11.1%
Space Separator 10
 
< 0.1%
Uppercase Letter 6
 
< 0.1%
Other Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 7705551
28.6%
a 7705550
28.6%
b 5779162
21.4%
d 3852777
14.3%
e 1926394
 
7.1%
i 6
 
< 0.1%
r 5
 
< 0.1%
o 5
 
< 0.1%
m 4
 
< 0.1%
n 4
 
< 0.1%
Other values (9) 20
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 5779162
16.7%
4 5779161
16.7%
9 3852775
11.1%
5 3852775
11.1%
8 3852774
11.1%
3 3852774
11.1%
1 1926389
 
5.6%
0 1926389
 
5.6%
6 1926388
 
5.6%
7 1926387
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
B 2
33.3%
C 2
33.3%
H 1
16.7%
D 1
16.7%
Other Punctuation
ValueCountFrequency (%)
, 2
50.0%
& 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 7705548
100.0%
Space Separator
ValueCountFrequency (%)
10
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42380536
61.1%
Latin 26969484
38.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 7705551
28.6%
a 7705550
28.6%
b 5779162
21.4%
d 3852777
14.3%
e 1926394
 
7.1%
i 6
 
< 0.1%
r 5
 
< 0.1%
o 5
 
< 0.1%
m 4
 
< 0.1%
n 4
 
< 0.1%
Other values (13) 26
 
< 0.1%
Common
ValueCountFrequency (%)
- 7705548
18.2%
2 5779162
13.6%
4 5779161
13.6%
9 3852775
9.1%
5 3852775
9.1%
8 3852774
9.1%
3 3852774
9.1%
1 1926389
 
4.5%
0 1926389
 
4.5%
6 1926388
 
4.5%
Other values (4) 1926401
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 69350020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 7705551
11.1%
a 7705550
11.1%
- 7705548
11.1%
2 5779162
8.3%
b 5779162
8.3%
4 5779161
8.3%
d 3852777
 
5.6%
9 3852775
 
5.6%
5 3852775
 
5.6%
8 3852774
 
5.6%
Other values (27) 13484785
19.4%
Distinct 3
Distinct (%) < 0.1%
Missing 4
Missing (%) < 0.1%
Memory size 14.7 MiB

Length

Max length 22
Median length 2
Mean length 2.000019207
Min length 2

Characters and Unicode

Total characters 3852815
Distinct characters 19
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row US
2nd row US
3rd row US
4th row US
5th row US
ValueCountFrequency (%)
us 1926387
> 99.9%
hemionchos 1
 
< 0.1%
striatus 1
 
< 0.1%
conspicuum 1
 
< 0.1%
icteridorum 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
U 1926387
50.0%
S 1926387
50.0%
i 5
 
< 0.1%
u 4
 
< 0.1%
o 4
 
< 0.1%
s 4
 
< 0.1%
m 3
 
< 0.1%
c 3
 
< 0.1%
t 3
 
< 0.1%
r 3
 
< 0.1%
Other values (9) 12
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3852776
> 99.9%
Lowercase Letter 37
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 5
13.5%
u 4
10.8%
o 4
10.8%
s 4
10.8%
m 3
8.1%
c 3
8.1%
t 3
8.1%
r 3
8.1%
e 2
 
5.4%
n 2
 
5.4%
Other values (4) 4
10.8%
Uppercase Letter
ValueCountFrequency (%)
U 1926387
50.0%
S 1926387
50.0%
C 1
 
< 0.1%
H 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3852813
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 1926387
50.0%
S 1926387
50.0%
i 5
 
< 0.1%
u 4
 
< 0.1%
o 4
 
< 0.1%
s 4
 
< 0.1%
m 3
 
< 0.1%
c 3
 
< 0.1%
t 3
 
< 0.1%
r 3
 
< 0.1%
Other values (8) 10
 
< 0.1%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3852815
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 1926387
50.0%
S 1926387
50.0%
i 5
 
< 0.1%
u 4
 
< 0.1%
o 4
 
< 0.1%
s 4
 
< 0.1%
m 3
 
< 0.1%
c 3
 
< 0.1%
t 3
 
< 0.1%
r 3
 
< 0.1%
Other values (9) 12
 
< 0.1%
Distinct 209948
Distinct (%) 10.9%
Missing 6
Missing (%) < 0.1%
Memory size 14.7 MiB

Length

Max length 24
Median length 24
Mean length 23.99591152
Min length 20

Characters and Unicode

Total characters 46225412
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 9123
Unique (%) 0.5%

Sample

1st row 2024-12-02T13:57:44.311Z
2nd row 2024-12-02T13:57:20.485Z
3rd row 2024-12-02T13:57:18.447Z
4th row 2024-12-02T13:57:45.124Z
5th row 2024-12-02T13:57:20.489Z
ValueCountFrequency (%)
2024-12-02t13:57:52.889z 37
 
< 0.1%
2024-12-02t13:57:28.783z 37
 
< 0.1%
2024-12-02t13:57:43.700z 36
 
< 0.1%
2024-12-02t13:57:40.815z 36
 
< 0.1%
2024-12-02t13:58:01.714z 36
 
< 0.1%
2024-12-02t13:57:53.093z 35
 
< 0.1%
2024-12-02t13:57:40.927z 35
 
< 0.1%
2024-12-02t13:57:30.406z 35
 
< 0.1%
2024-12-02t13:57:50.671z 35
 
< 0.1%
2024-12-02t13:57:33.269z 35
 
< 0.1%
Other values (209938) 1926030
> 99.9%

Most occurring characters

ValueCountFrequency (%)
2 8796773
19.0%
0 4884695
10.6%
1 4858658
10.5%
- 3852774
8.3%
: 3852774
8.3%
4 3098095
 
6.7%
5 3058765
 
6.6%
3 3051121
 
6.6%
T 1926387
 
4.2%
Z 1926387
 
4.2%
Other values (5) 6918983
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 32742672
70.8%
Other Punctuation 5777192
 
12.5%
Dash Punctuation 3852774
 
8.3%
Uppercase Letter 3852774
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 8796773
26.9%
0 4884695
14.9%
1 4858658
14.8%
4 3098095
 
9.5%
5 3058765
 
9.3%
3 3051121
 
9.3%
7 1479601
 
4.5%
9 1231841
 
3.8%
6 1162885
 
3.6%
8 1120238
 
3.4%
Other Punctuation
ValueCountFrequency (%)
: 3852774
66.7%
. 1924418
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 1926387
50.0%
Z 1926387
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 3852774
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42372638
91.7%
Latin 3852774
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 8796773
20.8%
0 4884695
11.5%
1 4858658
11.5%
- 3852774
9.1%
: 3852774
9.1%
4 3098095
 
7.3%
5 3058765
 
7.2%
3 3051121
 
7.2%
. 1924418
 
4.5%
7 1479601
 
3.5%
Other values (3) 3514964
 
8.3%
Latin
ValueCountFrequency (%)
T 1926387
50.0%
Z 1926387
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46225412
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 8796773
19.0%
0 4884695
10.6%
1 4858658
10.5%
- 3852774
8.3%
: 3852774
8.3%
4 3098095
 
6.7%
5 3058765
 
6.6%
3 3051121
 
6.6%
T 1926387
 
4.2%
Z 1926387
 
4.2%
Other values (5) 6918983
15.0%
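
The timestamp values above are stored as text. The character table lists 1926387 occurrences of `T` but only 1924418 of `.`, and the minimum length is 20 rather than 24, which suggests that 1969 values omit the fractional-second part. The sketch below is one way to parse both shapes with the standard library; `parse_utc` is a hypothetical helper name and the two format strings are assumptions inferred from the sample rows.

```python
from datetime import datetime, timezone

# Two layouts inferred from the report: with and without fractional seconds.
FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")

def parse_utc(text):
    """Return an aware UTC datetime, or None if the text matches neither layout."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    return None

with_fraction = parse_utc("2024-12-02T13:57:44.311Z")  # sample row from the report
without_fraction = parse_utc("2024-12-02T13:57:44Z")   # hypothetical 20-character value
not_a_timestamp = parse_utc("US")                      # anything else yields None

print(with_fraction, without_fraction, not_a_timestamp)
```

Returning `None` instead of raising makes it easy to count unparseable values before deciding how to treat them.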

elevation
Text

Missing 

Distinct 1093
Distinct (%) 16.0%
Missing 1919570
Missing (%) 99.6%
Memory size 14.7 MiB

Length

Max length 7
Median length 6
Mean length 5.361424593
Min length 3

Characters and Unicode

Total characters 36581
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 422
Unique (%) 6.2%

Sample

1st row 783.0
2nd row 15.0
3rd row 160.0
4th row 4070.0
5th row 870.0
ValueCountFrequency (%)
1981.0 616
 
9.0%
160.0 207
 
3.0%
350.0 169
 
2.5%
348.0 125
 
1.8%
164.0 123
 
1.8%
149.0 117
 
1.7%
309.0 116
 
1.7%
388.0 86
 
1.3%
988.0 82
 
1.2%
1100.0 73
 
1.1%
Other values (1083) 5109
74.9%

Most occurring characters

ValueCountFrequency (%)
0 9340
25.5%
. 6821
18.6%
1 4970
13.6%
2 2421
 
6.6%
8 2381
 
6.5%
3 2089
 
5.7%
9 1985
 
5.4%
4 1733
 
4.7%
5 1722
 
4.7%
6 1618
 
4.4%
Other values (4) 1501
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 29754
81.3%
Other Punctuation 6821
 
18.6%
Uppercase Letter 6
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9340
31.4%
1 4970
16.7%
2 2421
 
8.1%
8 2381
 
8.0%
3 2089
 
7.0%
9 1985
 
6.7%
4 1733
 
5.8%
5 1722
 
5.8%
6 1618
 
5.4%
7 1495
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
E 2
33.3%
M 2
33.3%
L 2
33.3%
Other Punctuation
ValueCountFrequency (%)
. 6821
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 36575
> 99.9%
Latin 6
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 9340
25.5%
. 6821
18.6%
1 4970
13.6%
2 2421
 
6.6%
8 2381
 
6.5%
3 2089
 
5.7%
9 1985
 
5.4%
4 1733
 
4.7%
5 1722
 
4.7%
6 1618
 
4.4%
Latin
ValueCountFrequency (%)
E 2
33.3%
M 2
33.3%
L 2
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36581
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 9340
25.5%
. 6821
18.6%
1 4970
13.6%
2 2421
 
6.6%
8 2381
 
6.5%
3 2089
 
5.7%
9 1985
 
5.4%
4 1733
 
4.7%
5 1722
 
4.7%
6 1618
 
4.4%
Other values (4) 1501
 
4.1%

elevationAccuracy
Text

Missing 

Distinct 71
Distinct (%) 2.0%
Missing 1922885
Missing (%) 99.8%
Memory size 14.7 MiB

Length

Max length 24
Median length 3
Mean length 3.14395667
Min length 3

Characters and Unicode

Total characters 11029
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 32
Unique (%) 0.9%

Sample

1st row 0.0
2nd row 0.0
3rd row 25.0
4th row 0.0
5th row 0.0
ValueCountFrequency (%)
0.0 3089
88.1%
25.0 201
 
5.7%
152.5 19
 
0.5%
13.0 13
 
0.4%
20.0 11
 
0.3%
53.0 10
 
0.3%
1.5 10
 
0.3%
50.0 9
 
0.3%
305.0 9
 
0.3%
76.0 8
 
0.2%
Other values (61) 129
 
3.7%

Most occurring characters

ValueCountFrequency (%)
0 6593
59.8%
. 3505
31.8%
5 365
 
3.3%
2 286
 
2.6%
1 102
 
0.9%
3 56
 
0.5%
7 31
 
0.3%
4 29
 
0.3%
6 20
 
0.2%
8 18
 
0.2%
Other values (5) 24
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7512
68.1%
Other Punctuation 3509
31.8%
Dash Punctuation 4
 
< 0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6593
87.8%
5 365
 
4.9%
2 286
 
3.8%
1 102
 
1.4%
3 56
 
0.7%
7 31
 
0.4%
4 29
 
0.4%
6 20
 
0.3%
8 18
 
0.2%
9 12
 
0.2%
Other Punctuation
ValueCountFrequency (%)
. 3505
99.9%
: 4
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11025
> 99.9%
Latin 4
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 6593
59.8%
. 3505
31.8%
5 365
 
3.3%
2 286
 
2.6%
1 102
 
0.9%
3 56
 
0.5%
7 31
 
0.3%
4 29
 
0.3%
6 20
 
0.2%
8 18
 
0.2%
Other values (3) 20
 
0.2%
Latin
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11029
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6593
59.8%
. 3505
31.8%
5 365
 
3.3%
2 286
 
2.6%
1 102
 
0.9%
3 56
 
0.5%
7 31
 
0.3%
4 29
 
0.3%
6 20
 
0.2%
8 18
 
0.2%
Other values (5) 24
 
0.2%
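
The elevationAccuracy tables list a few characters (`T`, `Z`, `-`, `:`) and a maximum length of 24 that do not fit a purely numeric column, which would be consistent with a small number of timestamp-like values having landed in it. The sketch below shows one simple way to flag such values. The `is_number` helper and the example list are hypothetical, and the last list entry is an invented stand-in for a stray value, not a row taken from the dataset.

```python
def is_number(text):
    """True if the text parses as a float (note: 'nan' and 'inf' also parse)."""
    try:
        float(text)
        return True
    except ValueError:
        return False

# First three entries appear in the frequency table; the last is hypothetical.
values = ["0.0", "25.0", "152.5", "2024-12-02T13:57:44.311Z"]

# Keep only the values that cannot be read as numbers.
suspect = [v for v in values if not is_number(v)]

print(suspect)
```

The same check applies to any of the text-typed columns that are expected to hold numbers, such as elevation, depth, and depthAccuracy.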

depth
Text

Missing 

Distinct 8763
Distinct (%) 1.1%
Missing 1143682
Missing (%) 59.4%
Memory size 14.7 MiB

Length

Max length 24
Median length 20
Mean length 4.480721492
Min length 3

Characters and Unicode

Total characters 3507110
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 2354
Unique (%) 0.3%

Sample

1st row 77.0
2nd row 225.0
3rd row 74.0
4th row 265.0
5th row 75.0
ValueCountFrequency (%)
0.5 20751
 
2.7%
1.0 11235
 
1.4%
84.0 9010
 
1.2%
82.0 8984
 
1.1%
18.0 8775
 
1.1%
15.0 8375
 
1.1%
3.0 8321
 
1.1%
27.0 7674
 
1.0%
55.0 7087
 
0.9%
2.0 6958
 
0.9%
Other values (8753) 685541
87.6%

Most occurring characters

ValueCountFrequency (%)
0 874405
24.9%
. 782711
22.3%
1 336183
 
9.6%
5 309343
 
8.8%
2 253724
 
7.2%
3 194487
 
5.5%
4 177647
 
5.1%
8 153099
 
4.4%
6 147797
 
4.2%
7 141755
 
4.0%
Other values (5) 135959
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2724387
77.7%
Other Punctuation 782715
 
22.3%
Dash Punctuation 4
 
< 0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 874405
32.1%
1 336183
 
12.3%
5 309343
 
11.4%
2 253724
 
9.3%
3 194487
 
7.1%
4 177647
 
6.5%
8 153099
 
5.6%
6 147797
 
5.4%
7 141755
 
5.2%
9 135947
 
5.0%
Other Punctuation
ValueCountFrequency (%)
. 782711
> 99.9%
: 4
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3507106
> 99.9%
Latin 4
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 874405
24.9%
. 782711
22.3%
1 336183
 
9.6%
5 309343
 
8.8%
2 253724
 
7.2%
3 194487
 
5.5%
4 177647
 
5.1%
8 153099
 
4.4%
6 147797
 
4.2%
7 141755
 
4.0%
Other values (3) 135955
 
3.9%
Latin
ValueCountFrequency (%)
T 2
50.0%
Z 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3507110
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 874405
24.9%
. 782711
22.3%
1 336183
 
9.6%
5 309343
 
8.8%
2 253724
 
7.2%
3 194487
 
5.5%
4 177647
 
5.1%
8 153099
 
4.4%
6 147797
 
4.2%
7 141755
 
4.0%
Other values (5) 135959
 
3.9%

depthAccuracy
Text

Missing 

Distinct 1589
Distinct (%) 0.2%
Missing 1205339
Missing (%) 62.6%
Memory size 14.7 MiB

Length

Max length 21
Median length 3
Mean length 3.277599181
Min length 3

Characters and Unicode

Total characters 2363326
Distinct characters 28
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 320
Unique (%) < 0.1%

Sample

1st row 0.0
2nd row 175.0
3rd row 0.0
4th row 0.0
5th row 0.0
ValueCountFrequency (%)
0.0 518729
71.9%
0.5 27364
 
3.8%
1.0 10273
 
1.4%
2.0 8610
 
1.2%
1.5 8052
 
1.1%
2.5 7638
 
1.1%
4.5 5897
 
0.8%
5.0 5274
 
0.7%
3.0 4946
 
0.7%
9.0 3978
 
0.6%
Other values (1580) 120294
 
16.7%

Most occurring characters

ValueCountFrequency (%)
0 1221509
51.7%
. 721051
30.5%
5 141990
 
6.0%
1 65416
 
2.8%
9 55282
 
2.3%
2 51770
 
2.2%
3 27061
 
1.1%
4 26874
 
1.1%
7 19469
 
0.8%
6 17914
 
0.8%
Other values (18) 14990
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1642248
69.5%
Other Punctuation 721052
30.5%
Lowercase Letter 23
 
< 0.1%
Uppercase Letter 2
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5
21.7%
e 3
13.0%
i 2
 
8.7%
l 2
 
8.7%
t 2
 
8.7%
m 2
 
8.7%
o 1
 
4.3%
s 1
 
4.3%
n 1
 
4.3%
u 1
 
4.3%
Other values (3) 3
13.0%
Decimal Number
ValueCountFrequency (%)
0 1221509
74.4%
5 141990
 
8.6%
1 65416
 
4.0%
9 55282
 
3.4%
2 51770
 
3.2%
3 27061
 
1.6%
4 26874
 
1.6%
7 19469
 
1.2%
6 17914
 
1.1%
8 14963
 
0.9%
Other Punctuation
ValueCountFrequency (%)
. 721051
> 99.9%
, 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
N 1
50.0%
A 1
50.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2363301
> 99.9%
Latin 25
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5
20.0%
e 3
12.0%
i 2
 
8.0%
l 2
 
8.0%
t 2
 
8.0%
m 2
 
8.0%
o 1
 
4.0%
N 1
 
4.0%
s 1
 
4.0%
n 1
 
4.0%
Other values (5) 5
20.0%
Common
ValueCountFrequency (%)
0 1221509
51.7%
. 721051
30.5%
5 141990
 
6.0%
1 65416
 
2.8%
9 55282
 
2.3%
2 51770
 
2.2%
3 27061
 
1.1%
4 26874
 
1.1%
7 19469
 
0.8%
6 17914
 
0.8%
Other values (3) 14965
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2363326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1221509
51.7%
. 721051
30.5%
5 141990
 
6.0%
1 65416
 
2.8%
9 55282
 
2.3%
2 51770
 
2.2%
3 27061
 
1.1%
4 26874
 
1.1%
7 19469
 
0.8%
6 17914
 
0.8%
Other values (18) 14990
 
0.6%
Distinct 603
Distinct (%) 6.8%
Missing 1917545
Missing (%) 99.5%
Memory size 14.7 MiB

Length

Max length 21
Median length 18
Mean length 12.94586347
Min length 3

Characters and Unicode

Total characters 114545
Distinct characters 19
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 205
Unique (%) 2.3%

Sample

1st row 0.0
2nd row 511.15289545417056
3rd row 32.07008492372621
4th row 1726.5254814515185
5th row 1860.2902638338219
ValueCountFrequency (%)
0.0 2777
31.4%
511.15289545417056 887
 
10.0%
365.9456782615661 341
 
3.9%
1436.265124532336 162
 
1.8%
3843.282664940326 125
 
1.4%
3.650579245692265 104
 
1.2%
1878.9020459397648 83
 
0.9%
1726.5254814515185 80
 
0.9%
857.2535535849795 75
 
0.8%
1809.5904164098843 71
 
0.8%
Other values (593) 4143
46.8%

Most occurring characters

ValueCountFrequency (%)
5 13488
11.8%
0 13325
11.6%
1 12344
10.8%
4 11004
9.6%
6 10191
8.9%
2 10054
8.8%
8 9236
8.1%
9 9120
8.0%
3 9109
8.0%
. 8847
7.7%
Other values (9) 7827
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 105688
92.3%
Other Punctuation 8847
 
7.7%
Lowercase Letter 7
 
< 0.1%
Uppercase Letter 2
 
< 0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 13488
12.8%
0 13325
12.6%
1 12344
11.7%
4 11004
10.4%
6 10191
9.6%
2 10054
9.5%
8 9236
8.7%
9 9120
8.6%
3 9109
8.6%
7 7817
7.4%
Lowercase Letter
ValueCountFrequency (%)
i 2
28.6%
a 2
28.6%
n 1
14.3%
m 1
14.3%
l 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
E 1
50.0%
A 1
50.0%
Other Punctuation
ValueCountFrequency (%)
. 8847
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 114536
> 99.9%
Latin 9
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
5 13488
11.8%
0 13325
11.6%
1 12344
10.8%
4 11004
9.6%
6 10191
8.9%
2 10054
8.8%
8 9236
8.1%
9 9120
8.0%
3 9109
8.0%
. 8847
7.7%
Other values (2) 7818
6.8%
Latin
ValueCountFrequency (%)
i 2
22.2%
a 2
22.2%
E 1
11.1%
A 1
11.1%
n 1
11.1%
m 1
11.1%
l 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 114545
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 13488
11.8%
0 13325
11.6%
1 12344
10.8%
4 11004
9.6%
6 10191
8.9%
2 10054
8.8%
8 9236
8.1%
9 9120
8.0%
3 9109
8.0%
. 8847
7.7%
Other values (9) 7827
6.8%

issue
Text

Distinct 402
Distinct (%) < 0.1%
Missing 37
Missing (%) < 0.1%
Memory size 14.7 MiB

Length

Max length 209
Median length 204
Mean length 89.0387161
Min length 8

Characters and Unicode

Total characters 171520265
Distinct characters 34
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 86
Unique (%) < 0.1%

Sample

1st row OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_INVALID
2nd row OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
3rd row OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY;CONTINENT_INVALID
4th row OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_INVALID
5th row OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY
ValueCountFrequency (%)
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_invalid 516478
26.8%
occurrence_status_inferred_from_individual_count 418366
21.7%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid 224592
11.7%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;continent_invalid 212163
11.0%
occurrence_status_inferred_from_individual_count;continent_derived_from_country 195778
 
10.2%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates 50454
 
2.6%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank 36575
 
1.9%
occurrence_status_inferred_from_individual_count;continent_derived_from_coordinates 32128
 
1.7%
occurrence_status_inferred_from_individual_count;country_derived_from_coordinates;geodetic_datum_assumed_wgs84;continent_invalid 27721
 
1.4%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;taxon_match_higherrank 25845
 
1.3%
Other values (392) 186256
 
9.7%

Most occurring characters

ValueCountFrequency (%)
_ 16496084
9.6%
N 15722593
 
9.2%
E 14675877
 
8.6%
I 14123173
 
8.2%
T 12772362
 
7.4%
R 12496875
 
7.3%
D 11817130
 
6.9%
C 11680988
 
6.8%
O 10988569
 
6.4%
U 10155291
 
5.9%
Other values (24) 40591323
23.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 150080785
87.5%
Connector Punctuation 16496084
 
9.6%
Other Punctuation 3079317
 
1.8%
Decimal Number 1864072
 
1.1%
Lowercase Letter 7
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 15722593
10.5%
E 14675877
9.8%
I 14123173
9.4%
T 12772362
8.5%
R 12496875
8.3%
D 11817130
7.9%
C 11680988
7.8%
O 10988569
 
7.3%
U 10155291
 
6.8%
A 7703967
 
5.1%
Other values (14) 27943960
18.6%
Lowercase Letter
ValueCountFrequency (%)
a 2
28.6%
e 1
14.3%
m 1
14.3%
t 1
14.3%
o 1
14.3%
d 1
14.3%
Decimal Number
ValueCountFrequency (%)
8 932036
50.0%
4 932036
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 16496084
100.0%
Other Punctuation
ValueCountFrequency (%)
; 3079317
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 150080792
87.5%
Common 21439473
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 15722593
10.5%
E 14675877
9.8%
I 14123173
9.4%
T 12772362
8.5%
R 12496875
8.3%
D 11817130
7.9%
C 11680988
7.8%
O 10988569
 
7.3%
U 10155291
 
6.8%
A 7703967
 
5.1%
Other values (20) 27943967
18.6%
Common
ValueCountFrequency (%)
_ 16496084
76.9%
; 3079317
 
14.4%
8 932036
 
4.3%
4 932036
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 171520265
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 16496084
9.6%
N 15722593
 
9.2%
E 14675877
 
8.6%
I 14123173
 
8.2%
T 12772362
 
7.4%
R 12496875
 
7.3%
D 11817130
 
6.9%
C 11680988
 
6.8%
O 10988569
 
6.4%
U 10155291
 
5.9%
Other values (24) 40591323
23.7%
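
Each issue value is a semicolon-separated list of flags, so the 402 distinct strings are combinations of a much smaller set of individual flags. The sketch below splits the five sample rows shown above and counts each flag once per row in which it appears. It is illustrative only; applying it to the full column would need the underlying data, which this report does not include.

```python
from collections import Counter

# The five sample rows listed for the issue variable.
rows = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_INVALID",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY;CONTINENT_INVALID",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84;CONTINENT_INVALID",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY",
]

# Split every row on ";" and tally the individual flags.
flag_counts = Counter(flag for row in rows for flag in row.split(";"))

for flag, count in flag_counts.most_common():
    print(count, flag)
```

Counting flags rather than whole strings gives a more readable picture of which checks fire most often.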

mediaType
Text

Missing 

Distinct 73
Distinct (%) < 0.1%
Missing 1683241
Missing (%) 87.4%
Memory size 14.7 MiB

Length

Max length 1704
Median length 10
Mean length 13.26034744
Min length 5

Characters and Unicode

Total characters 3224280
Distinct characters 12
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 17
Unique (%) < 0.1%

Sample

1st row StillImage
2nd row StillImage
3rd row StillImage
4th row StillImage
5th row StillImage
ValueCountFrequency (%)
stillimage 220054
90.5%
stillimage;stillimage 12696
 
5.2%
stillimage;stillimage;stillimage 3561
 
1.5%
stillimage;stillimage;stillimage;stillimage 2030
 
0.8%
stillimage;stillimage;stillimage;stillimage;stillimage 1055
 
0.4%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 769
 
0.3%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 533
 
0.2%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 390
 
0.2%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 309
 
0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage 213
 
0.1%
Other values (63) 1542
 
0.6%

Most occurring characters

ValueCountFrequency (%)
l 630442
19.6%
a 315222
9.8%
e 315222
9.8%
S 315220
9.8%
t 315220
9.8%
i 315220
9.8%
I 315220
9.8%
m 315220
9.8%
g 315220
9.8%
; 72070
 
2.2%
Other values (2) 4
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2521770
78.2%
Uppercase Letter 630440
 
19.6%
Other Punctuation 72070
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 630442
25.0%
a 315222
12.5%
e 315222
12.5%
t 315220
12.5%
i 315220
12.5%
m 315220
12.5%
g 315220
12.5%
f 2
 
< 0.1%
s 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 315220
50.0%
I 315220
50.0%
Other Punctuation
ValueCountFrequency (%)
; 72070
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3152210
97.8%
Common 72070
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 630442
20.0%
a 315222
10.0%
e 315222
10.0%
S 315220
10.0%
t 315220
10.0%
i 315220
10.0%
I 315220
10.0%
m 315220
10.0%
g 315220
10.0%
f 2
 
< 0.1%
Common
ValueCountFrequency (%)
; 72070
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3224280
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 630442
19.6%
a 315222
9.8%
e 315222
9.8%
S 315220
9.8%
t 315220
9.8%
i 315220
9.8%
I 315220
9.8%
m 315220
9.8%
g 315220
9.8%
; 72070
 
2.2%
Other values (2) 4
 
< 0.1%
Distinct: 7
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 14.7 MiB

Length

Max length: 45
Median length: 4
Mean length: 4.481446144
Min length: 4

Characters and Unicode

Total characters: 8633022
Distinct characters: 43
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: true
2nd row: false
3rd row: false
4th row: true
5th row: false
ValueCountFrequency (%)
true 999047
51.9%
false 927340
48.1%
latin_america 1
 
< 0.1%
echinorhynchus 1
 
< 0.1%
lageniformis 1
 
< 0.1%
ekbaum 1
 
< 0.1%
1938 1
 
< 0.1%
setaria 1
 
< 0.1%
labiatopapillosa 1
 
< 0.1%
alessandrini 1
 
< 0.1%
Other values (5) 5
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 1926390
22.3%
r 999054
11.6%
t 999050
11.6%
u 999050
11.6%
a 927351
10.7%
l 927345
10.7%
s 927345
10.7%
f 927341
10.7%
i 9
 
< 0.1%
8
 
< 0.1%
Other values (33) 79
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8632965
> 99.9%
Uppercase Letter 30
 
< 0.1%
Decimal Number 12
 
< 0.1%
Space Separator 8
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Connector Punctuation 2
 
< 0.1%
Close Punctuation 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1926390
22.3%
r 999054
11.6%
t 999050
11.6%
u 999050
11.6%
a 927351
10.7%
l 927345
10.7%
s 927345
10.7%
f 927341
10.7%
i 9
 
< 0.1%
n 6
 
< 0.1%
Other values (10) 24
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
A 6
20.0%
E 4
13.3%
I 3
10.0%
R 3
10.0%
N 2
 
6.7%
T 2
 
6.7%
C 2
 
6.7%
M 2
 
6.7%
S 2
 
6.7%
O 1
 
3.3%
Other values (3) 3
10.0%
Decimal Number
ValueCountFrequency (%)
8 4
33.3%
1 3
25.0%
3 2
16.7%
9 2
16.7%
7 1
 
8.3%
Space Separator
ValueCountFrequency (%)
8
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8632995
> 99.9%
Common 27
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1926390
22.3%
r 999054
11.6%
t 999050
11.6%
u 999050
11.6%
a 927351
10.7%
l 927345
10.7%
s 927345
10.7%
f 927341
10.7%
i 9
 
< 0.1%
n 6
 
< 0.1%
Other values (23) 54
 
< 0.1%
Common
ValueCountFrequency (%)
8
29.6%
8 4
14.8%
1 3
 
11.1%
, 3
 
11.1%
3 2
 
7.4%
9 2
 
7.4%
_ 2
 
7.4%
7 1
 
3.7%
) 1
 
3.7%
( 1
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8633022
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1926390
22.3%
r 999054
11.6%
t 999050
11.6%
u 999050
11.6%
a 927351
10.7%
l 927345
10.7%
s 927345
10.7%
f 927341
10.7%
i 9
 
< 0.1%
8
 
< 0.1%
Other values (33) 79
 
< 0.1%
Distinct: 3
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Memory size: 14.7 MiB

Length

Max length: 13
Median length: 5
Mean length: 4.98582114
Min length: 4

Characters and Unicode

Total characters: 9604631
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false
ValueCountFrequency (%)
false 1899057
98.6%
true 27330
 
1.4%
north_america 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 1926387
20.1%
f 1899057
19.8%
l 1899057
19.8%
s 1899057
19.8%
a 1899057
19.8%
t 27330
 
0.3%
r 27330
 
0.3%
u 27330
 
0.3%
A 4
 
< 0.1%
R 4
 
< 0.1%
Other values (9) 18
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9604605
> 99.9%
Uppercase Letter 24
 
< 0.1%
Connector Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 4
16.7%
R 4
16.7%
I 2
8.3%
E 2
8.3%
M 2
8.3%
O 2
8.3%
H 2
8.3%
T 2
8.3%
N 2
8.3%
C 2
8.3%
Lowercase Letter
ValueCountFrequency (%)
e 1926387
20.1%
f 1899057
19.8%
l 1899057
19.8%
s 1899057
19.8%
a 1899057
19.8%
t 27330
 
0.3%
r 27330
 
0.3%
u 27330
 
0.3%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9604629
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1926387
20.1%
f 1899057
19.8%
l 1899057
19.8%
s 1899057
19.8%
a 1899057
19.8%
t 27330
 
0.3%
r 27330
 
0.3%
u 27330
 
0.3%
A 4
 
< 0.1%
R 4
 
< 0.1%
Other values (8) 16
 
< 0.1%
Common
ValueCountFrequency (%)
_ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9604631
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1926387
20.1%
f 1899057
19.8%
l 1899057
19.8%
s 1899057
19.8%
a 1899057
19.8%
t 27330
 
0.3%
r 27330
 
0.3%
u 27330
 
0.3%
A 4
 
< 0.1%
R 4
 
< 0.1%
Other values (9) 18
 
< 0.1%
Distinct: 113080
Distinct (%): 5.9%
Missing: 5
Missing (%): < 0.1%
Memory size: 14.7 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.459736045
Min length: 1

Characters and Unicode

Total characters: 12443958
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 38722
Unique (%): 2.0%

Sample

1st row: 2237154
2nd row: 5189992
3rd row: 2258402
4th row: 5187825
5th row: 6104288
ValueCountFrequency (%)
225 23786
 
1.2%
5967481 15294
 
0.8%
105 11162
 
0.6%
52 8679
 
0.5%
7296 8105
 
0.4%
637 6531
 
0.3%
137 6331
 
0.3%
6540 4668
 
0.2%
8166676 4580
 
0.2%
256 4175
 
0.2%
Other values (113070) 1833077
95.2%

Most occurring characters

ValueCountFrequency (%)
2 2383907
19.2%
5 1295476
10.4%
1 1222710
9.8%
3 1175870
9.4%
8 1103667
8.9%
7 1093695
8.8%
4 1090035
8.8%
6 1059267
8.5%
9 1047606
8.4%
0 971722
7.8%
Other values (3) 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12443955
> 99.9%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2383907
19.2%
5 1295476
10.4%
1 1222710
9.8%
3 1175870
9.4%
8 1103667
8.9%
7 1093695
8.8%
4 1090035
8.8%
6 1059267
8.5%
9 1047606
8.4%
0 971722
7.8%
Uppercase Letter
ValueCountFrequency (%)
M 1
33.3%
E 1
33.3%
X 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 12443955
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2383907
19.2%
5 1295476
10.4%
1 1222710
9.8%
3 1175870
9.4%
8 1103667
8.9%
7 1093695
8.8%
4 1090035
8.8%
6 1059267
8.5%
9 1047606
8.4%
0 971722
7.8%
Latin
ValueCountFrequency (%)
M 1
33.3%
E 1
33.3%
X 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12443958
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2383907
19.2%
5 1295476
10.4%
1 1222710
9.8%
3 1175870
9.4%
8 1103667
8.9%
7 1093695
8.8%
4 1090035
8.8%
6 1059267
8.5%
9 1047606
8.4%
0 971722
7.8%
Other values (3) 3
 
< 0.1%
Distinct: 94525
Distinct (%): 4.9%
Missing: 2070
Missing (%): 0.1%
Memory size: 14.7 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.457625877
Min length: 1

Characters and Unicode

Total characters: 12426558
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 27026
Unique (%): 1.4%

Sample

1st row: 2237081
2nd row: 5189992
3rd row: 2258402
4th row: 5187825
5th row: 9722403
ValueCountFrequency (%)
225 23786
 
1.2%
5967481 15294
 
0.8%
105 11162
 
0.6%
52 8679
 
0.5%
7296 8105
 
0.4%
637 6531
 
0.3%
137 6505
 
0.3%
6540 4668
 
0.2%
255 4580
 
0.2%
256 4175
 
0.2%
Other values (94515) 1830838
95.1%

Most occurring characters

ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%
Other values (6) 6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12426552
> 99.9%
Lowercase Letter 5
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%
Lowercase Letter
ValueCountFrequency (%)
é 1
20.0%
x 1
20.0%
i 1
20.0%
c 1
20.0%
o 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12426552
> 99.9%
Latin 6
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%
Latin
ValueCountFrequency (%)
M 1
16.7%
é 1
16.7%
x 1
16.7%
i 1
16.7%
c 1
16.7%
o 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12426557
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2351717
18.9%
5 1313419
10.6%
1 1213699
9.8%
3 1149846
9.3%
8 1103530
8.9%
7 1101446
8.9%
4 1092660
8.8%
9 1069473
8.6%
6 1061655
8.5%
0 969107
7.8%
Other values (5) 5
 
< 0.1%
None
ValueCountFrequency (%)
é 1
100.0%
Distinct: 6
Distinct (%): < 0.1%
Missing: 5
Missing (%): < 0.1%
Memory size: 14.7 MiB

Length

Max length: 7
Median length: 1
Mean length: 1.000003115
Min length: 1

Characters and Unicode

Total characters: 1926394
Distinct characters: 11
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 1920497
99.7%
4 2826
 
0.1%
0 2065
 
0.1%
7 964
 
0.1%
3 35
 
< 0.1%
mex.2_1 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 1920498
99.7%
4 2826
 
0.1%
0 2065
 
0.1%
7 964
 
0.1%
3 35
 
< 0.1%
M 1
 
< 0.1%
E 1
 
< 0.1%
X 1
 
< 0.1%
. 1
 
< 0.1%
2 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1926389
> 99.9%
Uppercase Letter 3
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1920498
99.7%
4 2826
 
0.1%
0 2065
 
0.1%
7 964
 
0.1%
3 35
 
< 0.1%
2 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
M 1
33.3%
E 1
33.3%
X 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1926391
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1920498
99.7%
4 2826
 
0.1%
0 2065
 
0.1%
7 964
 
0.1%
3 35
 
< 0.1%
. 1
 
< 0.1%
2 1
 
< 0.1%
_ 1
 
< 0.1%
Latin
ValueCountFrequency (%)
M 1
33.3%
E 1
33.3%
X 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1926394
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1920498
99.7%
4 2826
 
0.1%
0 2065
 
0.1%
7 964
 
0.1%
3 35
 
< 0.1%
M 1
 
< 0.1%
E 1
 
< 0.1%
X 1
 
< 0.1%
. 1
 
< 0.1%
2 1
 
< 0.1%
Distinct: 52
Distinct (%): < 0.1%
Missing: 3161
Missing (%): 0.2%
Memory size: 14.7 MiB

Length

Max length: 19
Median length: 2
Mean length: 2.24715167
Min length: 2

Characters and Unicode

Total characters: 4321794
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 7
Unique (%): < 0.1%

Sample

1st row: 105
2nd row: 52
3rd row: 43
4th row: 50
5th row: 52
ValueCountFrequency (%)
52 864192
44.9%
54 392999
20.4%
42 241615
 
12.6%
43 117703
 
6.1%
50 91212
 
4.7%
5967481 68758
 
3.6%
108 45840
 
2.4%
105 32733
 
1.7%
44 19745
 
1.0%
74 10415
 
0.5%
Other values (44) 38022
 
2.0%

Most occurring characters

ValueCountFrequency (%)
5 1482086
34.3%
2 1107814
25.6%
4 872475
20.2%
0 178943
 
4.1%
1 152025
 
3.5%
3 135402
 
3.1%
8 125240
 
2.9%
9 93678
 
2.2%
7 92947
 
2.2%
6 81165
 
1.9%
Other values (13) 19
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4321775
> 99.9%
Lowercase Letter 14
 
< 0.1%
Uppercase Letter 3
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 1482086
34.3%
2 1107814
25.6%
4 872475
20.2%
0 178943
 
4.1%
1 152025
 
3.5%
3 135402
 
3.1%
8 125240
 
2.9%
9 93678
 
2.2%
7 92947
 
2.2%
6 81165
 
1.9%
Lowercase Letter
ValueCountFrequency (%)
a 4
28.6%
i 2
14.3%
r 2
14.3%
j 1
 
7.1%
l 1
 
7.1%
f 1
 
7.1%
o 1
 
7.1%
n 1
 
7.1%
u 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
B 1
33.3%
C 1
33.3%
S 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4321777
> 99.9%
Latin 17
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
23.5%
i 2
11.8%
r 2
11.8%
B 1
 
5.9%
j 1
 
5.9%
C 1
 
5.9%
l 1
 
5.9%
f 1
 
5.9%
o 1
 
5.9%
n 1
 
5.9%
Other values (2) 2
11.8%
Common
ValueCountFrequency (%)
5 1482086
34.3%
2 1107814
25.6%
4 872475
20.2%
0 178943
 
4.1%
1 152025
 
3.5%
3 135402
 
3.1%
8 125240
 
2.9%
9 93678
 
2.2%
7 92947
 
2.2%
6 81165
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4321794
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 1482086
34.3%
2 1107814
25.6%
4 872475
20.2%
0 178943
 
4.1%
1 152025
 
3.5%
3 135402
 
3.1%
8 125240
 
2.9%
9 93678
 
2.2%
7 92947
 
2.2%
6 81165
 
1.9%
Other values (13) 19
 
< 0.1%

classKey
Text

Missing 

Distinct: 115
Distinct (%): < 0.1%
Missing: 66158
Missing (%): 3.4%
Memory size: 14.7 MiB

Length

Max length: 9
Median length: 3
Mean length: 3.288918873
Min length: 3

Characters and Unicode

Total characters: 6118162
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: 308
2nd row: 225
3rd row: 206
4th row: 350
5th row: 225
ValueCountFrequency (%)
225 610123
32.8%
229 301912
16.2%
256 211086
 
11.3%
137 207854
 
11.2%
206 93050
 
5.0%
11545536 46190
 
2.5%
11133537 42750
 
2.3%
255 30336
 
1.6%
350 27087
 
1.5%
214 25635
 
1.4%
Other values (105) 264212
14.2%

Most occurring characters

ValueCountFrequency (%)
2 2351939
38.4%
5 1212864
19.8%
1 577401
 
9.4%
3 547057
 
8.9%
6 416257
 
6.8%
9 358589
 
5.9%
7 296942
 
4.9%
4 172597
 
2.8%
0 166184
 
2.7%
8 18326
 
0.3%
Other values (5) 6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6118156
> 99.9%
Uppercase Letter 3
 
< 0.1%
Other Punctuation 2
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2351939
38.4%
5 1212864
19.8%
1 577401
 
9.4%
3 547057
 
8.9%
6 416257
 
6.8%
9 358589
 
5.9%
7 296942
 
4.9%
4 172597
 
2.8%
0 166184
 
2.7%
8 18326
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
M 1
33.3%
E 1
33.3%
X 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6118159
> 99.9%
Latin 3
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2351939
38.4%
5 1212864
19.8%
1 577401
 
9.4%
3 547057
 
8.9%
6 416257
 
6.8%
9 358589
 
5.9%
7 296942
 
4.9%
4 172597
 
2.8%
0 166184
 
2.7%
8 18326
 
0.3%
Other values (2) 3
 
< 0.1%
Latin
ValueCountFrequency (%)
M 1
33.3%
E 1
33.3%
X 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6118162
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2351939
38.4%
5 1212864
19.8%
1 577401
 
9.4%
3 547057
 
8.9%
6 416257
 
6.8%
9 358589
 
5.9%
7 296942
 
4.9%
4 172597
 
2.8%
0 166184
 
2.7%
8 18326
 
0.3%
Other values (5) 6
 
< 0.1%

orderKey
Text

Missing 

Distinct: 418
Distinct (%): < 0.1%
Missing: 329533
Missing (%): 17.1%
Memory size: 14.7 MiB

Length

Max length: 80
Median length: 71
Mean length: 4.576634771
Min length: 3

Characters and Unicode

Total characters: 7308245
Distinct characters: 39
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 28
Unique (%): < 0.1%

Sample

1st row: 1184
2nd row: 454
3rd row: 831
4th row: 9661062
5th row: 7390893
ValueCountFrequency (%)
637 196384
 
12.3%
982 156428
 
9.8%
1456 116401
 
7.3%
7390893 113553
 
7.1%
1079 69439
 
4.3%
714 54200
 
3.4%
1231 49533
 
3.1%
440 35176
 
2.2%
9310756 31275
 
2.0%
9529005 30439
 
1.9%
Other values (419) 744045
46.6%

Most occurring characters

ValueCountFrequency (%)
9 1021057
14.0%
1 883038
12.1%
3 812835
11.1%
7 809815
11.1%
4 735664
10.1%
0 683837
9.4%
6 677497
9.3%
8 654881
9.0%
2 518629
7.1%
5 510778
7.0%
Other values (29) 214
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7308031
> 99.9%
Lowercase Letter 172
 
< 0.1%
Uppercase Letter 17
 
< 0.1%
Space Separator 13
 
< 0.1%
Other Punctuation 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 30
17.4%
i 16
9.3%
h 16
9.3%
e 14
8.1%
o 14
8.1%
n 14
8.1%
c 12
 
7.0%
l 10
 
5.8%
r 9
 
5.2%
d 7
 
4.1%
Other values (8) 30
17.4%
Decimal Number
ValueCountFrequency (%)
9 1021057
14.0%
1 883038
12.1%
3 812835
11.1%
7 809815
11.1%
4 735664
10.1%
0 683837
9.4%
6 677497
9.3%
8 654881
9.0%
2 518629
7.1%
5 510778
7.0%
Uppercase Letter
ValueCountFrequency (%)
P 4
23.5%
A 4
23.5%
S 2
11.8%
E 2
11.8%
M 1
 
5.9%
N 1
 
5.9%
C 1
 
5.9%
O 1
 
5.9%
L 1
 
5.9%
Space Separator
ValueCountFrequency (%)
13
100.0%
Other Punctuation
ValueCountFrequency (%)
, 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7308056
> 99.9%
Latin 189
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 30
15.9%
i 16
 
8.5%
h 16
 
8.5%
e 14
 
7.4%
o 14
 
7.4%
n 14
 
7.4%
c 12
 
6.3%
l 10
 
5.3%
r 9
 
4.8%
d 7
 
3.7%
Other values (17) 47
24.9%
Common
ValueCountFrequency (%)
9 1021057
14.0%
1 883038
12.1%
3 812835
11.1%
7 809815
11.1%
4 735664
10.1%
0 683837
9.4%
6 677497
9.3%
8 654881
9.0%
2 518629
7.1%
5 510778
7.0%
Other values (2) 25
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7308245
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 1021057
14.0%
1 883038
12.1%
3 812835
11.1%
7 809815
11.1%
4 735664
10.1%
0 683837
9.4%
6 677497
9.3%
8 654881
9.0%
2 518629
7.1%
5 510778
7.0%
Other values (29) 214
 
< 0.1%

familyKey
Text

Missing 

Distinct: 3525
Distinct (%): 0.2%
Missing: 144485
Missing (%): 7.5%
Memory size: 14.7 MiB

Length

Max length: 8
Median length: 4
Mean length: 4.434517382
Min length: 4

Characters and Unicode

Total characters: 7901902
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 272
Unique (%): < 0.1%

Sample

1st row: 4305937
2nd row: 2850
3rd row: 3362
4th row: 3249433
5th row: 2675
ValueCountFrequency (%)
4479 28956
 
1.6%
6779 28425
 
1.6%
3461 26787
 
1.5%
2304120 22783
 
1.3%
3445 18640
 
1.0%
2675 16831
 
0.9%
6760 16777
 
0.9%
3595 15856
 
0.9%
3588 14115
 
0.8%
3472 12961
 
0.7%
Other values (3515) 1579777
88.7%

Most occurring characters

ValueCountFrequency (%)
6 1049135
13.3%
4 928679
11.8%
2 887515
11.2%
5 878476
11.1%
7 843755
10.7%
8 810631
10.3%
3 797950
10.1%
9 648125
8.2%
0 542346
6.9%
1 515266
6.5%
Other values (6) 24
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7901878
> 99.9%
Lowercase Letter 21
 
< 0.1%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 1049135
13.3%
4 928679
11.8%
2 887515
11.2%
5 878476
11.1%
7 843755
10.7%
8 810631
10.3%
3 797950
10.1%
9 648125
8.2%
0 542346
6.9%
1 515266
6.5%
Lowercase Letter
ValueCountFrequency (%)
i 6
28.6%
a 6
28.6%
n 3
14.3%
m 3
14.3%
l 3
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7901878
> 99.9%
Latin 24
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
6 1049135
13.3%
4 928679
11.8%
2 887515
11.2%
5 878476
11.1%
7 843755
10.7%
8 810631
10.3%
3 797950
10.1%
9 648125
8.2%
0 542346
6.9%
1 515266
6.5%
Latin
ValueCountFrequency (%)
i 6
25.0%
a 6
25.0%
A 3
12.5%
n 3
12.5%
m 3
12.5%
l 3
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7901902
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 1049135
13.3%
4 928679
11.8%
2 887515
11.2%
5 878476
11.1%
7 843755
10.7%
8 810631
10.3%
3 797950
10.1%
9 648125
8.2%
0 542346
6.9%
1 515266
6.5%
Other values (6) 24
 
< 0.1%

genusKey
Text

Missing 

Distinct: 20902
Distinct (%): 1.3%
Missing: 358041
Missing (%): 18.6%
Memory size: 14.7 MiB

Length

Max length: 15
Median length: 7
Mean length: 7.014656149
Min length: 7

Characters and Unicode

Total characters: 11001450
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 3190
Unique (%): 0.2%

Sample

1st row: 2237081
2nd row: 9798512
3rd row: 2258400
4th row: 2275832
5th row: 4628849
ValueCountFrequency (%)
9819702 22884
 
1.5%
8179898 8956
 
0.6%
2227317 8948
 
0.6%
4646327 8189
 
0.5%
2227127 8096
 
0.5%
2318625 5223
 
0.3%
5189970 4536
 
0.3%
2302962 4534
 
0.3%
2224189 4234
 
0.3%
2301998 4085
 
0.3%
Other values (20892) 1488667
94.9%

Most occurring characters

ValueCountFrequency (%)
2 2691752
24.5%
3 1078641
9.8%
4 1008012
 
9.2%
1 1001812
 
9.1%
9 940816
 
8.6%
0 923361
 
8.4%
8 919570
 
8.4%
7 830784
 
7.6%
6 809158
 
7.4%
5 797507
 
7.2%
Other values (17) 37
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11001413
> 99.9%
Lowercase Letter 34
 
< 0.1%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6
17.6%
e 4
11.8%
h 4
11.8%
t 4
11.8%
l 3
8.8%
m 2
 
5.9%
n 2
 
5.9%
c 2
 
5.9%
o 2
 
5.9%
y 1
 
2.9%
Other values (4) 4
11.8%
Decimal Number
ValueCountFrequency (%)
2 2691752
24.5%
3 1078641
9.8%
4 1008012
 
9.2%
1 1001812
 
9.1%
9 940816
 
8.6%
0 923361
 
8.4%
8 919570
 
8.4%
7 830784
 
7.6%
6 809158
 
7.4%
5 797507
 
7.2%
Uppercase Letter
ValueCountFrequency (%)
P 1
33.3%
A 1
33.3%
N 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 11001413
> 99.9%
Latin 37
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6
16.2%
e 4
10.8%
h 4
10.8%
t 4
10.8%
l 3
8.1%
m 2
 
5.4%
n 2
 
5.4%
c 2
 
5.4%
o 2
 
5.4%
y 1
 
2.7%
Other values (7) 7
18.9%
Common
ValueCountFrequency (%)
2 2691752
24.5%
3 1078641
9.8%
4 1008012
 
9.2%
1 1001812
 
9.1%
9 940816
 
8.6%
0 923361
 
8.4%
8 919570
 
8.4%
7 830784
 
7.6%
6 809158
 
7.4%
5 797507
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11001450
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2691752
24.5%
3 1078641
9.8%
4 1008012
 
9.2%
1 1001812
 
9.1%
9 940816
 
8.6%
0 923361
 
8.4%
8 919570
 
8.4%
7 830784
 
7.6%
6 809158
 
7.4%
5 797507
 
7.2%
Other values (17) 37
 
< 0.1%

subgenusKey
Text

Missing 

Distinct: 4
Distinct (%): 80.0%
Missing: 1926388
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 19
Median length: 11
Mean length: 8.6
Min length: 2

Characters and Unicode

Total characters: 43
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 3
Unique (%): 60.0%

Sample

1st row: NE
2nd row: Palaeacanthocephala
3rd row: Chromadorea
4th row: Monogenea
5th row: NE
ValueCountFrequency (%)
ne 2
40.0%
palaeacanthocephala 1
20.0%
chromadorea 1
20.0%
monogenea 1
20.0%

Most occurring characters

ValueCountFrequency (%)
a 9
20.9%
o 5
11.6%
e 5
11.6%
h 3
 
7.0%
n 3
 
7.0%
r 2
 
4.7%
E 2
 
4.7%
N 2
 
4.7%
c 2
 
4.7%
l 2
 
4.7%
Other values (8) 8
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 36
83.7%
Uppercase Letter 7
 
16.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 9
25.0%
o 5
13.9%
e 5
13.9%
h 3
 
8.3%
n 3
 
8.3%
r 2
 
5.6%
c 2
 
5.6%
l 2
 
5.6%
t 1
 
2.8%
p 1
 
2.8%
Other values (3) 3
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
E 2
28.6%
N 2
28.6%
C 1
14.3%
P 1
14.3%
M 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 43
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 9
20.9%
o 5
11.6%
e 5
11.6%
h 3
 
7.0%
n 3
 
7.0%
r 2
 
4.7%
E 2
 
4.7%
N 2
 
4.7%
c 2
 
4.7%
l 2
 
4.7%
Other values (8) 8
18.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 43
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 9
20.9%
o 5
11.6%
e 5
11.6%
h 3
 
7.0%
n 3
 
7.0%
r 2
 
4.7%
E 2
 
4.7%
N 2
 
4.7%
c 2
 
4.7%
l 2
 
4.7%
Other values (8) 8
18.6%

speciesKey
Text

Missing 

Distinct: 81482
Distinct (%): 6.3%
Missing: 626819
Missing (%): 32.5%
Memory size: 14.7 MiB

Length

Max length: 15
Median length: 7
Mean length: 7.042964079
Min length: 7

Characters and Unicode

Total characters: 9152853
Distinct characters: 28
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 23449
Unique (%): 1.8%

Sample

1st row: 5189992
2nd row: 2258402
3rd row: 5187825
4th row: 9722403
5th row: 2274145
ValueCountFrequency (%)
2318104 2020
 
0.2%
5728138 1518
 
0.1%
7823183 1512
 
0.1%
2321421 1479
 
0.1%
9029731 1415
 
0.1%
2227405 1414
 
0.1%
2227381 1402
 
0.1%
5724968 1368
 
0.1%
2509463 1354
 
0.1%
8971201 1324
 
0.1%
Other values (81472) 1284768
98.9%

Most occurring characters

ValueCountFrequency (%)
2 1727288
18.9%
5 965514
10.5%
1 941828
10.3%
3 831165
9.1%
8 829394
9.1%
7 811641
8.9%
9 808573
8.8%
4 781033
8.5%
6 737014
8.1%
0 719364
7.9%
Other values (18) 39
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9152814
> 99.9%
Lowercase Letter 36
 
< 0.1%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 5
13.9%
a 5
13.9%
d 4
11.1%
h 4
11.1%
t 3
8.3%
o 3
8.3%
y 2
 
5.6%
n 2
 
5.6%
c 2
 
5.6%
r 1
 
2.8%
Other values (5) 5
13.9%
Decimal Number
ValueCountFrequency (%)
2 1727288
18.9%
5 965514
10.5%
1 941828
10.3%
3 831165
9.1%
8 829394
9.1%
7 811641
8.9%
9 808573
8.8%
4 781033
8.5%
6 737014
8.1%
0 719364
7.9%
Uppercase Letter
ValueCountFrequency (%)
R 1
33.3%
E 1
33.3%
P 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 9152814
> 99.9%
Latin 39
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 5
12.8%
a 5
12.8%
d 4
10.3%
h 4
10.3%
t 3
 
7.7%
o 3
 
7.7%
y 2
 
5.1%
n 2
 
5.1%
c 2
 
5.1%
r 1
 
2.6%
Other values (8) 8
20.5%
Common
ValueCountFrequency (%)
2 1727288
18.9%
5 965514
10.5%
1 941828
10.3%
3 831165
9.1%
8 829394
9.1%
7 811641
8.9%
9 808573
8.8%
4 781033
8.5%
6 737014
8.1%
0 719364
7.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9152853
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 1727288
18.9%
5 965514
10.5%
1 941828
10.3%
3 831165
9.1%
8 829394
9.1%
7 811641
8.9%
9 808573
8.8%
4 781033
8.5%
6 737014
8.1%
0 719364
7.9%
Other values (18) 39
 
< 0.1%

species
Text

Missing 

Distinct: 81449
Distinct (%): 6.3%
Missing: 626822
Missing (%): 32.5%
Memory size: 14.7 MiB

Length

Max length: 41
Median length: 36
Mean length: 18.98173243
Min length: 7

Characters and Unicode

Total characters: 24668109
Distinct characters: 54
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 23438
Unique (%): 1.8%

Sample

1st row: Bulla striata
2nd row: Stylopathes columnaris
3rd row: Ophiothrix suensonii
4th row: Naria labrolineata
5th row: Lysasterias heteractis
ValueCountFrequency (%)
conus 21648
 
0.8%
cerithium 8891
 
0.3%
cambarus 8740
 
0.3%
faxonius 8187
 
0.3%
procambarus 8031
 
0.3%
gracilis 6079
 
0.2%
aricidea 4891
 
0.2%
nassarius 4086
 
0.2%
pagurus 3943
 
0.2%
oliva 3823
 
0.1%
Other values (55326) 2520823
97.0%

Most occurring characters

ValueCountFrequency (%)
a 3034205
12.3%
i 2321115
 
9.4%
s 1756467
 
7.1%
e 1634157
 
6.6%
r 1566246
 
6.3%
o 1518962
 
6.2%
l 1442638
 
5.8%
t 1302012
 
5.3%
1299571
 
5.3%
u 1297622
 
5.3%
Other values (44) 7495114
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22068966
89.5%
Space Separator 1299571
 
5.3%
Uppercase Letter 1299571
 
5.3%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3034205
13.7%
i 2321115
10.5%
s 1756467
 
8.0%
e 1634157
 
7.4%
r 1566246
 
7.1%
o 1518962
 
6.9%
l 1442638
 
6.5%
t 1302012
 
5.9%
u 1297622
 
5.9%
n 1297510
 
5.9%
Other values (16) 4898032
22.2%
Uppercase Letter
ValueCountFrequency (%)
P 185835
14.3%
C 181674
14.0%
A 133784
10.3%
S 96420
 
7.4%
M 86379
 
6.6%
L 83596
 
6.4%
E 71837
 
5.5%
T 65730
 
5.1%
N 53880
 
4.1%
O 53677
 
4.1%
Other values (16) 286759
22.1%
Space Separator
ValueCountFrequency (%)
1299571
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23368537
94.7%
Common 1299572
 
5.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3034205
13.0%
i 2321115
 
9.9%
s 1756467
 
7.5%
e 1634157
 
7.0%
r 1566246
 
6.7%
o 1518962
 
6.5%
l 1442638
 
6.2%
t 1302012
 
5.6%
u 1297622
 
5.6%
n 1297510
 
5.6%
Other values (42) 6197603
26.5%
Common
ValueCountFrequency (%)
1299571
> 99.9%
- 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 24668109
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3034205
12.3%
i 2321115
 
9.4%
s 1756467
 
7.1%
e 1634157
 
6.6%
r 1566246
 
6.3%
o 1518962
 
6.2%
l 1442638
 
5.8%
t 1302012
 
5.3%
1299571
 
5.3%
u 1297622
 
5.3%
Other values (44) 7495114
30.4%
Distinct94525
Distinct (%)4.9%
Missing2067
Missing (%)0.1%
Memory size14.7 MiB

Length

Max length188
Median length120
Mean length29.47398518
Min length6

Characters and Unicode

Total characters56717556
Distinct characters115
Distinct categories8
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique27025
Unique (%)1.4%

Sample

1st rowSycon Risso, 1827
2nd rowBulla striata Bruguière, 1792
3rd rowStylopathes columnaris (Duchassaing, 1870)
4th rowOphiothrix suensonii Lütken, 1856
5th rowNaria labrolineata (Gaskoin, 1849)
Value                   Count      Frequency (%)
(blank)                 137132     2.0%
linnaeus                102227     1.5%
1758                    86436      1.3%
say                     52030      0.8%
lamarck                 41218      0.6%
dall                    26280      0.4%
1791                    25378      0.4%
gmelin                  24581      0.4%
gastropoda              23786      0.4%
conus                   22951      0.3%
Other values (67808)    6231685    92.0%

Most occurring characters

ValueCountFrequency (%)
a 4975492
 
8.8%
4849378
 
8.6%
i 3756798
 
6.6%
e 3416706
 
6.0%
r 2838443
 
5.0%
s 2680069
 
4.7%
o 2507220
 
4.4%
l 2493288
 
4.4%
n 2462018
 
4.3%
t 1946947
 
3.4%
Other values (105) 24791197
43.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37642616
66.4%
Decimal Number 6282608
 
11.1%
Space Separator 4849378
 
8.6%
Uppercase Letter 4140323
 
7.3%
Other Punctuation 2120049
 
3.7%
Close Punctuation 828480
 
1.5%
Open Punctuation 828480
 
1.5%
Dash Punctuation 25622
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4975492
13.2%
i 3756798
10.0%
e 3416706
 
9.1%
r 2838443
 
7.5%
s 2680069
 
7.1%
o 2507220
 
6.7%
l 2493288
 
6.6%
n 2462018
 
6.5%
t 1946947
 
5.2%
u 1878146
 
5.0%
Other values (50) 8687489
23.1%
Uppercase Letter
ValueCountFrequency (%)
S 395646
 
9.6%
P 384763
 
9.3%
C 374392
 
9.0%
L 361606
 
8.7%
A 294332
 
7.1%
M 289981
 
7.0%
B 240607
 
5.8%
H 228739
 
5.5%
G 219867
 
5.3%
E 172880
 
4.2%
Other values (27) 1177510
28.4%
Decimal Number
ValueCountFrequency (%)
1 1879993
29.9%
8 1317965
21.0%
9 685308
 
10.9%
7 543716
 
8.7%
5 369738
 
5.9%
2 318125
 
5.1%
6 311114
 
5.0%
0 300914
 
4.8%
4 287040
 
4.6%
3 268695
 
4.3%
Other Punctuation
ValueCountFrequency (%)
, 1587604
74.9%
. 385539
 
18.2%
& 137136
 
6.5%
' 9770
 
0.5%
Space Separator
ValueCountFrequency (%)
4849378
100.0%
Close Punctuation
ValueCountFrequency (%)
) 828480
100.0%
Open Punctuation
ValueCountFrequency (%)
( 828480
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 25622
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41782939
73.7%
Common 14934617
 
26.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4975492
 
11.9%
i 3756798
 
9.0%
e 3416706
 
8.2%
r 2838443
 
6.8%
s 2680069
 
6.4%
o 2507220
 
6.0%
l 2493288
 
6.0%
n 2462018
 
5.9%
t 1946947
 
4.7%
u 1878146
 
4.5%
Other values (87) 12827812
30.7%
Common
ValueCountFrequency (%)
4849378
32.5%
1 1879993
 
12.6%
, 1587604
 
10.6%
8 1317965
 
8.8%
) 828480
 
5.5%
( 828480
 
5.5%
9 685308
 
4.6%
7 543716
 
3.6%
. 385539
 
2.6%
5 369738
 
2.5%
Other values (8) 1658416
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56587562
99.8%
None 129994
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4975492
 
8.8%
4849378
 
8.6%
i 3756798
 
6.6%
e 3416706
 
6.0%
r 2838443
 
5.0%
s 2680069
 
4.7%
o 2507220
 
4.4%
l 2493288
 
4.4%
n 2462018
 
4.4%
t 1946947
 
3.4%
Other values (60) 24661203
43.6%
None
ValueCountFrequency (%)
ü 34699
26.7%
ö 25701
19.8%
è 22198
17.1%
é 21881
16.8%
ø 9165
 
7.1%
å 5072
 
3.9%
Ö 4790
 
3.7%
á 1982
 
1.5%
ä 1161
 
0.9%
í 1065
 
0.8%
Other values (35) 2280
 
1.8%
Distinct133993
Distinct (%)8.5%
Missing353775
Missing (%)18.4%
Memory size14.7 MiB

Length

Max length85
Median length59
Mean length19.44688666
Min length4

Characters and Unicode

Total characters30582524
Distinct characters78
Distinct categories9
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique51619
Unique (%)3.3%

Sample

1st rowScypha sp.
2nd rowBulla striata
3rd rowStylopathes columnaris
4th rowOphiothrix suensonii
5th rowCypraea labrolineata
Value                   Count      Frequency (%)
sp                      198063     6.0%
conus                   24328      0.7%
cypraea                 15395      0.5%
cambarus                12003      0.4%
cerithium               9397       0.3%
orconectes              8683       0.3%
procambarus             8141       0.2%
nassarius               6728       0.2%
gracilis                6632       0.2%
terebra                 5168       0.2%
Other values (70829)    3025211    91.1%

Most occurring characters

ValueCountFrequency (%)
a 3610431
 
11.8%
i 2750408
 
9.0%
s 2277504
 
7.4%
e 1954344
 
6.4%
r 1901340
 
6.2%
o 1840596
 
6.0%
1747131
 
5.7%
l 1714269
 
5.6%
n 1541700
 
5.0%
t 1537214
 
5.0%
Other values (68) 9707587
31.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26724880
87.4%
Space Separator 1747131
 
5.7%
Uppercase Letter 1685395
 
5.5%
Other Punctuation 198792
 
0.7%
Open Punctuation 112865
 
0.4%
Close Punctuation 112865
 
0.4%
Decimal Number 468
 
< 0.1%
Dash Punctuation 110
 
< 0.1%
Math Symbol 18
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3610431
13.5%
i 2750408
10.3%
s 2277504
 
8.5%
e 1954344
 
7.3%
r 1901340
 
7.1%
o 1840596
 
6.9%
l 1714269
 
6.4%
n 1541700
 
5.8%
t 1537214
 
5.8%
u 1522243
 
5.7%
Other values (18) 6074831
22.7%
Uppercase Letter
ValueCountFrequency (%)
C 246724
14.6%
P 236762
14.0%
A 165937
9.8%
S 135252
 
8.0%
M 109812
 
6.5%
T 106747
 
6.3%
L 97706
 
5.8%
E 85779
 
5.1%
O 78988
 
4.7%
N 66257
 
3.9%
Other values (16) 355431
21.1%
Decimal Number
ValueCountFrequency (%)
1 156
33.3%
8 110
23.5%
4 58
 
12.4%
9 38
 
8.1%
6 27
 
5.8%
2 25
 
5.3%
5 19
 
4.1%
7 16
 
3.4%
0 12
 
2.6%
3 7
 
1.5%
Other Punctuation
ValueCountFrequency (%)
. 198575
99.9%
, 107
 
0.1%
" 60
 
< 0.1%
/ 29
 
< 0.1%
' 15
 
< 0.1%
& 3
 
< 0.1%
? 3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 112864
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 112864
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1747131
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 110
100.0%
Math Symbol
ValueCountFrequency (%)
+ 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28410275
92.9%
Common 2172249
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3610431
12.7%
i 2750408
 
9.7%
s 2277504
 
8.0%
e 1954344
 
6.9%
r 1901340
 
6.7%
o 1840596
 
6.5%
l 1714269
 
6.0%
n 1541700
 
5.4%
t 1537214
 
5.4%
u 1522243
 
5.4%
Other values (44) 7760226
27.3%
Common
ValueCountFrequency (%)
1747131
80.4%
. 198575
 
9.1%
( 112864
 
5.2%
) 112864
 
5.2%
1 156
 
< 0.1%
8 110
 
< 0.1%
- 110
 
< 0.1%
, 107
 
< 0.1%
" 60
 
< 0.1%
4 58
 
< 0.1%
Other values (14) 214
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30582508
> 99.9%
None 16
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3610431
 
11.8%
i 2750408
 
9.0%
s 2277504
 
7.4%
e 1954344
 
6.4%
r 1901340
 
6.2%
o 1840596
 
6.0%
1747131
 
5.7%
l 1714269
 
5.6%
n 1541700
 
5.0%
t 1537214
 
5.0%
Other values (66) 9707571
31.7%
None
ValueCountFrequency (%)
ü 15
93.8%
æ 1
 
6.2%

protocol
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing6
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters5779161
Distinct characters3
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowEML
2nd rowEML
3rd rowEML
4th rowEML
5th rowEML
Value    Count      Frequency (%)
eml      1926387    100.0%

Most occurring characters

ValueCountFrequency (%)
E 1926387
33.3%
M 1926387
33.3%
L 1926387
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5779161
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1926387
33.3%
M 1926387
33.3%
L 1926387
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 5779161
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1926387
33.3%
M 1926387
33.3%
L 1926387
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5779161
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1926387
33.3%
M 1926387
33.3%
L 1926387
33.3%
Distinct209952
Distinct (%)10.9%
Missing2
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length24
Median length24
Mean length23.99588194
Min length7

Characters and Unicode

Total characters46225451
Distinct characters34
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique9127
Unique (%)0.5%

Sample

1st row2024-12-02T13:57:44.311Z
2nd row2024-12-02T13:57:20.485Z
3rd row2024-12-02T13:57:18.447Z
4th row2024-12-02T13:57:45.124Z
5th row2024-12-02T13:57:20.489Z
Value                       Count      Frequency (%)
2024-12-02t13:57:28.783z    37         < 0.1%
2024-12-02t13:57:52.889z    37         < 0.1%
2024-12-02t13:57:43.700z    36         < 0.1%
2024-12-02t13:58:01.714z    36         < 0.1%
2024-12-02t13:57:40.815z    36         < 0.1%
2024-12-02t13:57:30.406z    35         < 0.1%
2024-12-02t13:57:53.093z    35         < 0.1%
2024-12-02t13:57:41.994z    35         < 0.1%
2024-12-02t13:57:40.927z    35         < 0.1%
2024-12-02t13:57:35.574z    35         < 0.1%
Other values (209942)       1926034    > 99.9%

Most occurring characters

ValueCountFrequency (%)
2 8796773
19.0%
0 4884695
10.6%
1 4858658
10.5%
: 3852774
8.3%
- 3852774
8.3%
4 3098095
 
6.7%
5 3058765
 
6.6%
3 3051121
 
6.6%
T 1926388
 
4.2%
Z 1926387
 
4.2%
Other values (24) 6919021
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 32742672
70.8%
Other Punctuation 5777192
 
12.5%
Uppercase Letter 3852785
 
8.3%
Dash Punctuation 3852774
 
8.3%
Lowercase Letter 28
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
14.3%
r 4
14.3%
h 4
14.3%
n 3
10.7%
i 2
7.1%
c 2
7.1%
y 2
7.1%
u 2
7.1%
o 1
 
3.6%
s 1
 
3.6%
Other values (3) 3
10.7%
Decimal Number
ValueCountFrequency (%)
2 8796773
26.9%
0 4884695
14.9%
1 4858658
14.8%
4 3098095
 
9.5%
5 3058765
 
9.3%
3 3051121
 
9.3%
7 1479601
 
4.5%
9 1231841
 
3.8%
6 1162885
 
3.6%
8 1120238
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
T 1926388
50.0%
Z 1926387
50.0%
E 3
 
< 0.1%
S 2
 
< 0.1%
C 2
 
< 0.1%
A 1
 
< 0.1%
P 1
 
< 0.1%
D 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 3852774
66.7%
. 1924418
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 3852774
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42372638
91.7%
Latin 3852813
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 1926388
50.0%
Z 1926387
50.0%
a 4
 
< 0.1%
r 4
 
< 0.1%
h 4
 
< 0.1%
n 3
 
< 0.1%
E 3
 
< 0.1%
i 2
 
< 0.1%
c 2
 
< 0.1%
y 2
 
< 0.1%
Other values (11) 14
 
< 0.1%
Common
ValueCountFrequency (%)
2 8796773
20.8%
0 4884695
11.5%
1 4858658
11.5%
: 3852774
9.1%
- 3852774
9.1%
4 3098095
 
7.3%
5 3058765
 
7.2%
3 3051121
 
7.2%
. 1924418
 
4.5%
7 1479601
 
3.5%
Other values (3) 3514964
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46225451
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 8796773
19.0%
0 4884695
10.6%
1 4858658
10.5%
: 3852774
8.3%
- 3852774
8.3%
4 3098095
 
6.7%
5 3058765
 
6.6%
3 3051121
 
6.6%
T 1926388
 
4.2%
Z 1926387
 
4.2%
Other values (24) 6919021
15.0%
Distinct4
Distinct (%)< 0.1%
Missing3
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length24
Median length24
Mean length23.99997872
Min length7

Characters and Unicode

Total characters46233319
Distinct characters27
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)< 0.1%

Sample

1st row2024-12-02T11:48:23.416Z
2nd row2024-12-02T11:48:23.416Z
3rd row2024-12-02T11:48:23.416Z
4th row2024-12-02T11:48:23.416Z
5th row2024-12-02T11:48:23.416Z
Value                       Count      Frequency (%)
2024-12-02t11:48:23.416z    1926387    > 99.9%
echinorhynchus              1          < 0.1%
setaria                     1          < 0.1%
sphyranura                  1          < 0.1%
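
The three taxon-like strings listed alongside the timestamp suggest that a handful of rows may be shifted relative to their columns. A minimal sketch for isolating such rows follows. It assumes the column has been loaded as a list of strings with `None` for missing cells; the pattern `ISO_UTC` and the helper `non_timestamp_values` are illustrative names, not anything produced by the report.

```python
import re

# Matches values shaped like 2024-12-02T11:48:23.416Z (fractional seconds optional).
ISO_UTC = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z")


def non_timestamp_values(values):
    """Return (index, value) pairs for non-missing values that are not UTC timestamps."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if v is not None and ISO_UTC.fullmatch(v) is None
    ]


# Hypothetical excerpt: two well-formed cells, one stray taxon name, one missing cell.
column = ["2024-12-02T11:48:23.416Z", "Setaria", None, "2024-12-02T11:48:23.416Z"]
suspects = non_timestamp_values(column)
print(suspects)
```

The returned indices can then be used to inspect the full source rows before deciding whether they are genuinely misaligned.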

Most occurring characters

ValueCountFrequency (%)
2 9631935
20.8%
1 7705548
16.7%
4 5779161
12.5%
- 3852774
 
8.3%
0 3852774
 
8.3%
: 3852774
 
8.3%
6 1926387
 
4.2%
Z 1926387
 
4.2%
. 1926387
 
4.2%
3 1926387
 
4.2%
Other values (17) 3852805
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 32748579
70.8%
Other Punctuation 5779161
 
12.5%
Uppercase Letter 3852777
 
8.3%
Dash Punctuation 3852774
 
8.3%
Lowercase Letter 28
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
14.3%
h 4
14.3%
r 4
14.3%
n 3
10.7%
y 2
7.1%
u 2
7.1%
c 2
7.1%
i 2
7.1%
o 1
 
3.6%
s 1
 
3.6%
Other values (3) 3
10.7%
Decimal Number
ValueCountFrequency (%)
2 9631935
29.4%
1 7705548
23.5%
4 5779161
17.6%
0 3852774
 
11.8%
6 1926387
 
5.9%
3 1926387
 
5.9%
8 1926387
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
Z 1926387
50.0%
T 1926387
50.0%
S 2
 
< 0.1%
E 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
: 3852774
66.7%
. 1926387
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 3852774
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42380514
91.7%
Latin 3852805
 
8.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
Z 1926387
50.0%
T 1926387
50.0%
a 4
 
< 0.1%
h 4
 
< 0.1%
r 4
 
< 0.1%
n 3
 
< 0.1%
y 2
 
< 0.1%
S 2
 
< 0.1%
u 2
 
< 0.1%
c 2
 
< 0.1%
Other values (7) 8
 
< 0.1%
Common
ValueCountFrequency (%)
2 9631935
22.7%
1 7705548
18.2%
4 5779161
13.6%
- 3852774
 
9.1%
0 3852774
 
9.1%
: 3852774
 
9.1%
6 1926387
 
4.5%
. 1926387
 
4.5%
3 1926387
 
4.5%
8 1926387
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 46233319
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 9631935
20.8%
1 7705548
16.7%
4 5779161
12.5%
- 3852774
 
8.3%
0 3852774
 
8.3%
: 3852774
 
8.3%
6 1926387
 
4.2%
Z 1926387
 
4.2%
. 1926387
 
4.2%
3 1926387
 
4.2%
Other values (17) 3852805
8.3%

repatriated
Text

Missing 

Distinct2
Distinct (%)< 0.1%
Missing110144
Missing (%)5.7%
Memory size14.7 MiB

Length

Max length5
Median length4
Mean length4.47822738
Min length4

Characters and Unicode

Total characters8133576
Distinct characters8
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowtrue
4th rowfalse
5th rowtrue
Value    Count     Frequency (%)
true     947669    52.2%
false    868580    47.8%

Most occurring characters

ValueCountFrequency (%)
e 1816249
22.3%
t 947669
11.7%
r 947669
11.7%
u 947669
11.7%
f 868580
10.7%
a 868580
10.7%
l 868580
10.7%
s 868580
10.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8133576
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1816249
22.3%
t 947669
11.7%
r 947669
11.7%
u 947669
11.7%
f 868580
10.7%
a 868580
10.7%
l 868580
10.7%
s 868580
10.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 8133576
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1816249
22.3%
t 947669
11.7%
r 947669
11.7%
u 947669
11.7%
f 868580
10.7%
a 868580
10.7%
l 868580
10.7%
s 868580
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8133576
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1816249
22.3%
t 947669
11.7%
r 947669
11.7%
u 947669
11.7%
f 868580
10.7%
a 868580
10.7%
l 868580
10.7%
s 868580
10.7%

relativeOrganismQuantity
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing1926392
Missing (%)> 99.9%
Memory size14.7 MiB
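
The figures for this variable are consistent with Distinct (%) being computed over non-missing values (1 of 1) and Missing (%) over all observations (1926392 of 1926393). That reading is inferred from the numbers shown, not stated by the report. The sketch below works through the same arithmetic on a scaled-down stand-in column; the helper name `summarize` and the ten-row column are assumptions made for illustration.

```python
def summarize(values):
    """Distinct and missing statistics for a column that uses None for missing cells."""
    total = len(values)
    present = [v for v in values if v is not None]
    missing = total - len(present)
    distinct = len(set(present))
    return {
        "distinct": distinct,
        # share of distinct values among the populated cells only
        "distinct_pct": 100.0 * distinct / len(present) if present else 0.0,
        "missing": missing,
        # share of missing cells among all rows
        "missing_pct": 100.0 * missing / total if total else 0.0,
    }


# Scaled-down stand-in: one populated cell among otherwise missing ones.
column = [None] * 9 + ["821cc27a-e3bb-4bc5-ac34-89ada245069d"]
stats = summarize(column)
print(stats)
```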

Length

Max length36
Median length36
Mean length36
Min length36

Characters and Unicode

Total characters36
Distinct characters16
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row821cc27a-e3bb-4bc5-ac34-89ada245069d
Value                                   Count    Frequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d    1        100.0%

Most occurring characters

ValueCountFrequency (%)
c 4
11.1%
a 4
11.1%
- 4
11.1%
2 3
8.3%
b 3
8.3%
4 3
8.3%
8 2
 
5.6%
3 2
 
5.6%
5 2
 
5.6%
9 2
 
5.6%
Other values (6) 7
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18
50.0%
Lowercase Letter 14
38.9%
Dash Punctuation 4
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
16.7%
4 3
16.7%
8 2
11.1%
3 2
11.1%
5 2
11.1%
9 2
11.1%
1 1
 
5.6%
7 1
 
5.6%
0 1
 
5.6%
6 1
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
c 4
28.6%
a 4
28.6%
b 3
21.4%
d 2
14.3%
e 1
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22
61.1%
Latin 14
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
- 4
18.2%
2 3
13.6%
4 3
13.6%
8 2
9.1%
3 2
9.1%
5 2
9.1%
9 2
9.1%
1 1
 
4.5%
7 1
 
4.5%
0 1
 
4.5%
Latin
ValueCountFrequency (%)
c 4
28.6%
a 4
28.6%
b 3
21.4%
d 2
14.3%
e 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 4
11.1%
a 4
11.1%
- 4
11.1%
2 3
8.3%
b 3
8.3%
4 3
8.3%
8 2
 
5.6%
3 2
 
5.6%
5 2
 
5.6%
9 2
 
5.6%
Other values (6) 7
19.4%

projectId
Text

Missing 

Distinct3
Distinct (%)100.0%
Missing1926390
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length16
Median length12
Mean length10
Min length2

Characters and Unicode

Total characters30
Distinct characters16
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3
Unique (%)100.0%

Sample

1st rowlageniformis
2nd rowUS
3rd rowlabiatopapillosa
Value               Count    Frequency (%)
lageniformis        1        33.3%
us                  1        33.3%
labiatopapillosa    1        33.3%

Most occurring characters

ValueCountFrequency (%)
a 5
16.7%
l 4
13.3%
i 4
13.3%
o 3
10.0%
s 2
 
6.7%
p 2
 
6.7%
g 1
 
3.3%
e 1
 
3.3%
n 1
 
3.3%
f 1
 
3.3%
Other values (6) 6
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28
93.3%
Uppercase Letter 2
 
6.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5
17.9%
l 4
14.3%
i 4
14.3%
o 3
10.7%
s 2
 
7.1%
p 2
 
7.1%
g 1
 
3.6%
e 1
 
3.6%
n 1
 
3.6%
f 1
 
3.6%
Other values (4) 4
14.3%
Uppercase Letter
ValueCountFrequency (%)
U 1
50.0%
S 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5
16.7%
l 4
13.3%
i 4
13.3%
o 3
10.0%
s 2
 
6.7%
p 2
 
6.7%
g 1
 
3.3%
e 1
 
3.3%
n 1
 
3.3%
f 1
 
3.3%
Other values (6) 6
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5
16.7%
l 4
13.3%
i 4
13.3%
o 3
10.0%
s 2
 
6.7%
p 2
 
6.7%
g 1
 
3.3%
e 1
 
3.3%
n 1
 
3.3%
f 1
 
3.3%
Other values (6) 6
20.0%
Distinct3
Distinct (%)< 0.1%
Missing5
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length24
Median length5
Mean length4.997351001
Min length4

Characters and Unicode

Total characters9626837
Distinct characters21
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
Value                       Count      Frequency (%)
false                       1921265    99.7%
true                        5122       0.3%
2024-12-02t13:57:24.316z    1          < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 1926387
20.0%
f 1921265
20.0%
l 1921265
20.0%
s 1921265
20.0%
a 1921265
20.0%
t 5122
 
0.1%
r 5122
 
0.1%
u 5122
 
0.1%
2 5
 
< 0.1%
1 3
 
< 0.1%
Other values (11) 16
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9626813
> 99.9%
Decimal Number 17
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Dash Punctuation 2
 
< 0.1%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1926387
20.0%
f 1921265
20.0%
l 1921265
20.0%
s 1921265
20.0%
a 1921265
20.0%
t 5122
 
0.1%
r 5122
 
0.1%
u 5122
 
0.1%
Decimal Number
ValueCountFrequency (%)
2 5
29.4%
1 3
17.6%
3 2
 
11.8%
4 2
 
11.8%
0 2
 
11.8%
5 1
 
5.9%
7 1
 
5.9%
6 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 2
66.7%
. 1
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 1
50.0%
Z 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9626815
> 99.9%
Common 22
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 5
22.7%
1 3
13.6%
: 2
 
9.1%
3 2
 
9.1%
4 2
 
9.1%
- 2
 
9.1%
0 2
 
9.1%
5 1
 
4.5%
7 1
 
4.5%
. 1
 
4.5%
Latin
ValueCountFrequency (%)
e 1926387
20.0%
f 1921265
20.0%
l 1921265
20.0%
s 1921265
20.0%
a 1921265
20.0%
t 5122
 
0.1%
r 5122
 
0.1%
u 5122
 
0.1%
T 1
 
< 0.1%
Z 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9626837
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1926387
20.0%
f 1921265
20.0%
l 1921265
20.0%
s 1921265
20.0%
a 1921265
20.0%
t 5122
 
0.1%
r 5122
 
0.1%
u 5122
 
0.1%
2 5
 
< 0.1%
1 3
 
< 0.1%
Other values (11) 16
 
< 0.1%

gbifRegion
Text

Missing 

Distinct7
Distinct (%)< 0.1%
Missing115678
Missing (%)6.0%
Memory size14.7 MiB

Length

Max length13
Median length13
Mean length10.88896817
Min length4

Characters and Unicode

Total characters19716818
Distinct characters16
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowLATIN_AMERICA
4th rowNORTH_AMERICA
5th rowASIA
Value            Count     Frequency (%)
north_america    900416    49.7%
latin_america    368762    20.4%
asia             206888    11.4%
oceania          167374    9.2%
africa           56930     3.1%
europe           56674     3.1%
antarctica       53671     3.0%
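
The percentages in this table appear to be shares of the non-missing values only: the seven counts sum to 1810715, which equals the 1926393 observations minus the 115678 missing cells. The short sketch below recomputes the shares from the counts shown. The variable names are illustrative, and rounding to one decimal place is an assumption about how the report formats its output.

```python
# Counts copied from the gbifRegion value table.
counts = {
    "north_america": 900416,
    "latin_america": 368762,
    "asia": 206888,
    "oceania": 167374,
    "africa": 56930,
    "europe": 56674,
    "antarctica": 53671,
}

total = sum(counts.values())  # number of non-missing values
shares = {region: round(100 * n / total, 1) for region, n in counts.items()}

for region, share in shares.items():
    print(f"{region:<14} {share:>5.1f}%")
```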

Most occurring characters

ValueCountFrequency (%)
A 3930515
19.9%
R 2336869
11.9%
I 2122803
10.8%
C 1600824
8.1%
E 1549900
 
7.9%
N 1490223
 
7.6%
T 1376520
 
7.0%
_ 1269178
 
6.4%
M 1269178
 
6.4%
O 1124464
 
5.7%
Other values (6) 1646344
8.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 18447640
93.6%
Connector Punctuation 1269178
 
6.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 3930515
21.3%
R 2336869
12.7%
I 2122803
11.5%
C 1600824
8.7%
E 1549900
 
8.4%
N 1490223
 
8.1%
T 1376520
 
7.5%
M 1269178
 
6.9%
O 1124464
 
6.1%
H 900416
 
4.9%
Other values (5) 745928
 
4.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1269178
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18447640
93.6%
Common 1269178
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 3930515
21.3%
R 2336869
12.7%
I 2122803
11.5%
C 1600824
8.7%
E 1549900
 
8.4%
N 1490223
 
8.1%
T 1376520
 
7.5%
M 1269178
 
6.9%
O 1124464
 
6.1%
H 900416
 
4.9%
Other values (5) 745928
 
4.0%
Common
ValueCountFrequency (%)
_ 1269178
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19716818
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 3930515
19.9%
R 2336869
11.9%
I 2122803
10.8%
C 1600824
8.1%
E 1549900
 
7.9%
N 1490223
 
7.6%
T 1376520
 
7.0%
_ 1269178
 
6.4%
M 1269178
 
6.4%
O 1124464
 
5.7%
Other values (6) 1646344
8.3%
Distinct3
Distinct (%)< 0.1%
Missing3
Missing (%)< 0.1%
Memory size14.7 MiB

Length

Max length13
Median length13
Mean length12.99998962
Min length5

Characters and Unicode

Total characters25043050
Distinct characters15
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
Value            Count      Frequency (%)
north_america    1926387    > 99.9%
species          2          < 0.1%
genus            1          < 0.1%

Most occurring characters

ValueCountFrequency (%)
R 3852774
15.4%
A 3852774
15.4%
E 1926392
7.7%
I 1926389
7.7%
C 1926389
7.7%
N 1926388
7.7%
O 1926387
7.7%
T 1926387
7.7%
H 1926387
7.7%
_ 1926387
7.7%
Other values (5) 1926396
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 23116663
92.3%
Connector Punctuation 1926387
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 3852774
16.7%
A 3852774
16.7%
E 1926392
8.3%
I 1926389
8.3%
C 1926389
8.3%
N 1926388
8.3%
O 1926387
8.3%
T 1926387
8.3%
H 1926387
8.3%
M 1926387
8.3%
Other values (4) 9
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 1926387
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23116663
92.3%
Common 1926387
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 3852774
16.7%
A 3852774
16.7%
E 1926392
8.3%
I 1926389
8.3%
C 1926389
8.3%
N 1926388
8.3%
O 1926387
8.3%
T 1926387
8.3%
H 1926387
8.3%
M 1926387
8.3%
Other values (4) 9
 
< 0.1%
Common
ValueCountFrequency (%)
_ 1926387
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25043050
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 3852774
15.4%
A 3852774
15.4%
E 1926392
7.7%
I 1926389
7.7%
C 1926389
7.7%
N 1926388
7.7%
O 1926387
7.7%
T 1926387
7.7%
H 1926387
7.7%
_ 1926387
7.7%
Other values (5) 1926396
7.7%

level0Gid
Text

Missing 

Distinct226
Distinct (%)0.1%
Missing1691070
Missing (%)87.8%
Memory size14.7 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters705969
Distinct characters28
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18
Unique (%)< 0.1%

Sample

1st rowUSA
2nd rowPAN
3rd rowUSA
4th rowUSA
5th rowPAN
Value                 Count     Frequency (%)
usa                   138756    59.0%
pan                   11701     5.0%
jpn                   8794      3.7%
mex                   4690      2.0%
phl                   4467      1.9%
can                   4382      1.9%
dom                   3446      1.5%
cri                   3146      1.3%
mdg                   2984      1.3%
pri                   2846      1.2%
Other values (216)    50111     21.3%

Most occurring characters

ValueCountFrequency (%)
A 169536
24.0%
U 149988
21.2%
S 147144
20.8%
N 36249
 
5.1%
P 32498
 
4.6%
M 17314
 
2.5%
C 16563
 
2.3%
R 16511
 
2.3%
I 11617
 
1.6%
J 11408
 
1.6%
Other values (18) 97141
13.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 705967
> 99.9%
Decimal Number 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 169536
24.0%
U 149988
21.2%
S 147144
20.8%
N 36249
 
5.1%
P 32498
 
4.6%
M 17314
 
2.5%
C 16563
 
2.3%
R 16511
 
2.3%
I 11617
 
1.6%
J 11408
 
1.6%
Other values (16) 97139
13.8%
Decimal Number
ValueCountFrequency (%)
0 1
50.0%
6 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 705967
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 169536
24.0%
U 149988
21.2%
S 147144
20.8%
N 36249
 
5.1%
P 32498
 
4.6%
M 17314
 
2.5%
C 16563
 
2.3%
R 16511
 
2.3%
I 11617
 
1.6%
J 11408
 
1.6%
Other values (16) 97139
13.8%
Common
ValueCountFrequency (%)
0 1
50.0%
6 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 705969
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 169536
24.0%
U 149988
21.2%
S 147144
20.8%
N 36249
 
5.1%
P 32498
 
4.6%
M 17314
 
2.5%
C 16563
 
2.3%
R 16511
 
2.3%
I 11617
 
1.6%
J 11408
 
1.6%
Other values (18) 97141
13.8%

level0Name
Text

Missing 

Distinct226
Distinct (%)0.1%
Missing1691070
Missing (%)87.8%
Memory size14.7 MiB

Length

Max length32
Median length13
Mean length11.1625043
Min length4

Characters and Unicode

Total characters2626794
Distinct characters62
Distinct categories7
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique18
Unique (%)< 0.1%

Sample

1st rowUnited States
2nd rowPanama
3rd rowUnited States
4th rowUnited States
5th rowPanama
Value                 Count     Frequency (%)
united                139310    34.8%
states                138840    34.7%
panama                11701     2.9%
japan                 8794      2.2%
méxico                4690      1.2%
philippines           4467      1.1%
canada                4382      1.1%
republic              3662      0.9%
dominican             3446      0.9%
costa                 3146      0.8%
Other values (265)    77445     19.4%

Most occurring characters

ValueCountFrequency (%)
t 437054
16.6%
e 318254
12.1%
a 296155
11.3%
i 217274
8.3%
n 208865
8.0%
s 170178
 
6.5%
164560
 
6.3%
d 159630
 
6.1%
S 144075
 
5.5%
U 140430
 
5.3%
Other values (52) 370319
14.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2060448
78.4%
Uppercase Letter 398595
 
15.2%
Space Separator 164560
 
6.3%
Other Punctuation 3038
 
0.1%
Close Punctuation 67
 
< 0.1%
Open Punctuation 67
 
< 0.1%
Dash Punctuation 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 437054
21.2%
e 318254
15.4%
a 296155
14.4%
i 217274
10.5%
n 208865
10.1%
s 170178
 
8.3%
d 159630
 
7.7%
o 34362
 
1.7%
c 31190
 
1.5%
r 29949
 
1.5%
Other values (21) 157537
 
7.6%
Uppercase Letter
ValueCountFrequency (%)
S 144075
36.1%
U 140430
35.2%
P 22372
 
5.6%
C 16711
 
4.2%
J 10574
 
2.7%
R 10283
 
2.6%
M 9896
 
2.5%
A 6942
 
1.7%
B 6541
 
1.6%
T 6469
 
1.6%
Other values (14) 24302
 
6.1%
Other Punctuation
ValueCountFrequency (%)
. 1616
53.2%
, 1418
46.7%
' 4
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 164560
100.0%
Close Punctuation
ValueCountFrequency (%)
) 67
100.0%
Open Punctuation
ValueCountFrequency (%)
( 67
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2459043
93.6%
Common 167751
 
6.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 437054
17.8%
e 318254
12.9%
a 296155
12.0%
i 217274
8.8%
n 208865
8.5%
s 170178
 
6.9%
d 159630
 
6.5%
S 144075
 
5.9%
U 140430
 
5.7%
o 34362
 
1.4%
Other values (45) 332766
13.5%
Common
ValueCountFrequency (%)
(space) 164560
98.1%
. 1616
 
1.0%
, 1418
 
0.8%
) 67
 
< 0.1%
( 67
 
< 0.1%
- 19
 
< 0.1%
' 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2620383
99.8%
None 6411
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 437054
16.7%
e 318254
12.1%
a 296155
11.3%
i 217274
8.3%
n 208865
8.0%
s 170178
 
6.5%
(space) 164560
 
6.3%
d 159630
 
6.1%
S 144075
 
5.5%
U 140430
 
5.4%
Other values (47) 363908
13.9%
None
ValueCountFrequency (%)
é 4700
73.3%
ç 1697
 
26.5%
ã 5
 
0.1%
í 5
 
0.1%
ô 4
 
0.1%

level1Gid
Text

Missing 

Distinct: 1804
Distinct (%): 0.8%
Missing: 1694638
Missing (%): 88.0%
Memory size: 14.7 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.672701776
Min length: 6

Characters and Unicode

Total characters: 1778187
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 305
Unique (%): 0.1%

Sample

1st row: USA.10_1
2nd row: PAN.4_1
3rd row: USA.14_1
4th row: USA.16_1
5th row: PAN.12_1
ValueCountFrequency (%)
usa.10_1 18116
 
7.8%
usa.5_1 8182
 
3.5%
usa.43_1 8000
 
3.5%
pan.4_1 7933
 
3.4%
jpn.32_1 6827
 
2.9%
usa.47_1 6423
 
2.8%
usa.21_1 5755
 
2.5%
usa.44_1 5753
 
2.5%
usa.11_1 5094
 
2.2%
usa.9_1 4888
 
2.1%
Other values (1794) 154784
66.8%

Most occurring characters

ValueCountFrequency (%)
1 319208
18.0%
_ 231589
13.0%
. 231553
13.0%
A 166849
9.4%
U 148282
8.3%
S 146784
8.3%
2 66976
 
3.8%
4 61992
 
3.5%
3 50877
 
2.9%
N 36177
 
2.0%
Other values (28) 317900
17.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 695761
39.1%
Decimal Number 619284
34.8%
Connector Punctuation 231589
 
13.0%
Other Punctuation 231553
 
13.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 166849
24.0%
U 148282
21.3%
S 146784
21.1%
N 36177
 
5.2%
P 32438
 
4.7%
M 17257
 
2.5%
R 16460
 
2.4%
C 14782
 
2.1%
I 11490
 
1.7%
J 11408
 
1.6%
Other values (16) 93834
13.5%
Decimal Number
ValueCountFrequency (%)
1 319208
51.5%
2 66976
 
10.8%
4 61992
 
10.0%
3 50877
 
8.2%
5 30137
 
4.9%
0 26692
 
4.3%
9 18370
 
3.0%
7 17007
 
2.7%
6 15063
 
2.4%
8 12962
 
2.1%
Connector Punctuation
ValueCountFrequency (%)
_ 231589
100.0%
Other Punctuation
ValueCountFrequency (%)
. 231553
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1082426
60.9%
Latin 695761
39.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 166849
24.0%
U 148282
21.3%
S 146784
21.1%
N 36177
 
5.2%
P 32438
 
4.7%
M 17257
 
2.5%
R 16460
 
2.4%
C 14782
 
2.1%
I 11490
 
1.7%
J 11408
 
1.6%
Other values (16) 93834
13.5%
Common
ValueCountFrequency (%)
1 319208
29.5%
_ 231589
21.4%
. 231553
21.4%
2 66976
 
6.2%
4 61992
 
5.7%
3 50877
 
4.7%
5 30137
 
2.8%
0 26692
 
2.5%
9 18370
 
1.7%
7 17007
 
1.6%
Other values (2) 28025
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1778187
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 319208
18.0%
_ 231589
13.0%
. 231553
13.0%
A 166849
9.4%
U 148282
8.3%
S 146784
8.3%
2 66976
 
3.8%
4 61992
 
3.5%
3 50877
 
2.9%
N 36177
 
2.0%
Other values (28) 317900
17.9%

level1Name
Text

Missing 

Distinct: 1737
Distinct (%): 0.7%
Missing: 1694634
Missing (%): 88.0%
Memory size: 14.7 MiB

Length

Max length: 48
Median length: 30
Mean length: 8.96956321
Min length: 3

Characters and Unicode

Total characters: 2078777
Distinct characters: 115
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
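
The block breakdown for a column like this one can be approximated by mapping each character's code point to a Unicode block range. The sketch below is illustrative only: the `BLOCKS` table lists just a few Latin blocks, and `block_of` and `block_counts` are hypothetical helpers. The report's own tooling evidently labels some blocks differently (it shows accented Latin letters under "None"), so the names produced here should not be expected to match its output exactly.

```python
from collections import Counter

# Code point ranges for a handful of Unicode blocks (not the full list).
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin (ASCII)"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
]


def block_of(ch):
    """Return the block name for one character, or "Other" if it is not listed."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"


def block_counts(strings):
    """Count characters per block across all strings."""
    return Counter(block_of(ch) for text in strings for ch in text)


# Hypothetical sample values modelled on the "Sample" rows of this variable.
values = ["Florida", "Col\u00f3n", "Panam\u00e1"]
counts = block_counts(values)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

A lookup table keeps the example dependency-free; for full coverage, the ranges would need to be loaded from the Unicode Character Database's block definitions rather than typed in by hand.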

Unique

Unique: 296
Unique (%): 0.1%

Sample

1st row: Florida
2nd row: Colón
3rd row: Illinois
4th row: Iowa
5th row: Panamá
ValueCountFrequency (%)
florida 18120
 
6.1%
california 9283
 
3.1%
carolina 8221
 
2.8%
tennessee 8000
 
2.7%
colón 7933
 
2.7%
virginia 7606
 
2.6%
okinawa 6827
 
2.3%
new 5902
 
2.0%
maryland 5759
 
1.9%
texas 5753
 
1.9%
Other values (1876) 212777
71.8%

Most occurring characters

ValueCountFrequency (%)
a 294071
14.1%
i 195344
 
9.4%
n 167961
 
8.1%
o 145084
 
7.0%
r 120163
 
5.8%
s 119088
 
5.7%
e 117144
 
5.6%
l 98274
 
4.7%
t 78261
 
3.8%
(space) 64422
 
3.1%
Other values (105) 678965
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1721765
82.8%
Uppercase Letter 288487
 
13.9%
Space Separator 64422
 
3.1%
Dash Punctuation 2995
 
0.1%
Other Punctuation 1067
 
0.1%
Modifier Symbol 28
 
< 0.1%
Connector Punctuation 5
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Open Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 294071
17.1%
i 195344
11.3%
n 167961
9.8%
o 145084
8.4%
r 120163
 
7.0%
s 119088
 
6.9%
e 117144
 
6.8%
l 98274
 
5.7%
t 78261
 
4.5%
u 50903
 
3.0%
Other values (60) 335472
19.5%
Uppercase Letter
ValueCountFrequency (%)
C 41271
14.3%
M 28854
 
10.0%
T 22649
 
7.9%
A 21731
 
7.5%
S 20838
 
7.2%
F 20114
 
7.0%
N 18874
 
6.5%
O 15937
 
5.5%
V 10384
 
3.6%
I 9849
 
3.4%
Other values (24) 77986
27.0%
Other Punctuation
ValueCountFrequency (%)
' 910
85.3%
/ 61
 
5.7%
. 55
 
5.2%
! 25
 
2.3%
, 16
 
1.5%
Space Separator
ValueCountFrequency (%)
(space) 64422
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2995
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 28
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%
Close Punctuation
ValueCountFrequency (%)
] 4
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2010252
96.7%
Common 68525
 
3.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 294071
14.6%
i 195344
 
9.7%
n 167961
 
8.4%
o 145084
 
7.2%
r 120163
 
6.0%
s 119088
 
5.9%
e 117144
 
5.8%
l 98274
 
4.9%
t 78261
 
3.9%
u 50903
 
2.5%
Other values (94) 623959
31.0%
Common
ValueCountFrequency (%)
(space) 64422
94.0%
- 2995
 
4.4%
' 910
 
1.3%
/ 61
 
0.1%
. 55
 
0.1%
` 28
 
< 0.1%
! 25
 
< 0.1%
, 16
 
< 0.1%
_ 5
 
< 0.1%
] 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2053783
98.8%
None 24840
 
1.2%
Latin Ext Additional 154
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 294071
14.3%
i 195344
 
9.5%
n 167961
 
8.2%
o 145084
 
7.1%
r 120163
 
5.9%
s 119088
 
5.8%
e 117144
 
5.7%
l 98274
 
4.8%
t 78261
 
3.8%
(space) 64422
 
3.1%
Other values (53) 653971
31.8%
None
ValueCountFrequency (%)
ó 10786
43.4%
í 4492
18.1%
á 4303
 
17.3%
é 1522
 
6.1%
Î 1159
 
4.7%
ü 851
 
3.4%
ã 420
 
1.7%
ö 314
 
1.3%
à 159
 
0.6%
ñ 150
 
0.6%
Other values (31) 684
 
2.8%
Latin Ext Additional
ValueCountFrequency (%)
64
41.6%
45
29.2%
13
 
8.4%
11
 
7.1%
8
 
5.2%
5
 
3.2%
4
 
2.6%
1
 
0.6%
1
 
0.6%
1
 
0.6%

level2Gid
Text

Missing 

Distinct: 7611
Distinct (%): 3.5%
Missing: 1708984
Missing (%): 88.7%
Memory size: 14.7 MiB

Length

Max length: 12
Median length: 11
Mean length: 10.36195374
Min length: 7

Characters and Unicode

Total characters: 2252782
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1730
Unique (%): 0.8%

Sample

1st row: USA.10.59_1
2nd row: PAN.4.2_1
3rd row: USA.14.18_1
4th row: USA.16.3_1
5th row: PAN.12.2_1
ValueCountFrequency (%)
jpn.32.28_1 6059
 
2.8%
usa.10.43_1 6013
 
2.8%
pan.4.2_1 5746
 
2.6%
usa.9.1_1 4888
 
2.2%
usa.10.44_1 4299
 
2.0%
usa.22.1_1 3251
 
1.5%
mdg.2.1_1 2723
 
1.3%
dom.29.3_1 2676
 
1.2%
cri.5.2_1 2210
 
1.0%
pan.4.5_1 2107
 
1.0%
Other values (7601) 177437
81.6%

Most occurring characters

ValueCountFrequency (%)
. 434450
19.3%
1 372514
16.5%
_ 217409
9.7%
A 164597
 
7.3%
U 146942
 
6.5%
S 144683
 
6.4%
2 131222
 
5.8%
4 103045
 
4.6%
3 93136
 
4.1%
5 60231
 
2.7%
Other values (28) 384553
17.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 948698
42.1%
Uppercase Letter 652225
29.0%
Other Punctuation 434450
19.3%
Connector Punctuation 217409
 
9.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 164597
25.2%
U 146942
22.5%
S 144683
22.2%
N 35543
 
5.4%
P 27322
 
4.2%
C 13617
 
2.1%
M 13373
 
2.1%
R 11680
 
1.8%
J 9629
 
1.5%
E 9601
 
1.5%
Other values (16) 75238
11.5%
Decimal Number
ValueCountFrequency (%)
1 372514
39.3%
2 131222
 
13.8%
4 103045
 
10.9%
3 93136
 
9.8%
5 60231
 
6.3%
0 42025
 
4.4%
6 40348
 
4.3%
7 37009
 
3.9%
8 36094
 
3.8%
9 33074
 
3.5%
Other Punctuation
ValueCountFrequency (%)
. 434450
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 217409
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1600557
71.0%
Latin 652225
29.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 164597
25.2%
U 146942
22.5%
S 144683
22.2%
N 35543
 
5.4%
P 27322
 
4.2%
C 13617
 
2.1%
M 13373
 
2.1%
R 11680
 
1.8%
J 9629
 
1.5%
E 9601
 
1.5%
Other values (16) 75238
11.5%
Common
ValueCountFrequency (%)
. 434450
27.1%
1 372514
23.3%
_ 217409
13.6%
2 131222
 
8.2%
4 103045
 
6.4%
3 93136
 
5.8%
5 60231
 
3.8%
0 42025
 
2.6%
6 40348
 
2.5%
7 37009
 
2.3%
Other values (2) 69168
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2252782
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 434450
19.3%
1 372514
16.5%
_ 217409
9.7%
A 164597
 
7.3%
U 146942
 
6.5%
S 144683
 
6.4%
2 131222
 
5.8%
4 103045
 
4.6%
3 93136
 
4.1%
5 60231
 
2.7%
Other values (28) 384553
17.1%

level2Name
Text

Missing 

Distinct: 6184
Distinct (%): 2.8%
Missing: 1709049
Missing (%): 88.7%
Memory size: 14.7 MiB

Length

Max length: 32
Median length: 29
Mean length: 8.376522931
Min length: 1

Characters and Unicode

Total characters: 1820587
Distinct characters: 147
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1557
Unique (%): 0.7%

Sample

1st row: Seminole
2nd row: Colón
3rd row: Cumberland
4th row: Allamakee
5th row: Chepo
ValueCountFrequency (%)
san 6246
 
2.3%
onna 6059
 
2.2%
miami-dade 6013
 
2.2%
colón 5755
 
2.1%
of 5128
 
1.9%
columbia 5068
 
1.8%
monroe 4935
 
1.8%
district 4903
 
1.8%
de 3904
 
1.4%
barnstable 3251
 
1.2%
Other values (6463) 224555
81.4%

Most occurring characters

ValueCountFrequency (%)
a 214955
 
11.8%
n 155743
 
8.6%
e 142976
 
7.9%
o 137584
 
7.6%
i 116282
 
6.4%
r 99257
 
5.5%
l 83851
 
4.6%
t 80842
 
4.4%
s 71222
 
3.9%
(space) 58473
 
3.2%
Other values (137) 659402
36.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1471166
80.8%
Uppercase Letter 273185
 
15.0%
Space Separator 58473
 
3.2%
Dash Punctuation 10521
 
0.6%
Other Punctuation 5065
 
0.3%
Decimal Number 1906
 
0.1%
Open Punctuation 151
 
< 0.1%
Close Punctuation 116
 
< 0.1%
Modifier Symbol 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 214955
14.6%
n 155743
10.6%
e 142976
9.7%
o 137584
9.4%
i 116282
 
7.9%
r 99257
 
6.7%
l 83851
 
5.7%
t 80842
 
5.5%
s 71222
 
4.8%
u 53336
 
3.6%
Other values (75) 315118
21.4%
Uppercase Letter
ValueCountFrequency (%)
C 36023
13.2%
M 27375
 
10.0%
S 23695
 
8.7%
D 22216
 
8.1%
B 17416
 
6.4%
P 17006
 
6.2%
L 16649
 
6.1%
A 12629
 
4.6%
O 11173
 
4.1%
G 9385
 
3.4%
Other values (30) 79618
29.1%
Decimal Number
ValueCountFrequency (%)
1 963
50.5%
2 208
 
10.9%
5 181
 
9.5%
0 165
 
8.7%
9 109
 
5.7%
8 96
 
5.0%
7 61
 
3.2%
4 45
 
2.4%
6 44
 
2.3%
3 34
 
1.8%
Other Punctuation
ValueCountFrequency (%)
' 2527
49.9%
. 2332
46.0%
/ 177
 
3.5%
? 18
 
0.4%
, 7
 
0.1%
& 3
 
0.1%
# 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 58473
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10521
100.0%
Open Punctuation
ValueCountFrequency (%)
( 151
100.0%
Close Punctuation
ValueCountFrequency (%)
) 116
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1744351
95.8%
Common 76236
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 214955
 
12.3%
n 155743
 
8.9%
e 142976
 
8.2%
o 137584
 
7.9%
i 116282
 
6.7%
r 99257
 
5.7%
l 83851
 
4.8%
t 80842
 
4.6%
s 71222
 
4.1%
u 53336
 
3.1%
Other values (115) 588303
33.7%
Common
ValueCountFrequency (%)
(space) 58473
76.7%
- 10521
 
13.8%
' 2527
 
3.3%
. 2332
 
3.1%
1 963
 
1.3%
2 208
 
0.3%
5 181
 
0.2%
/ 177
 
0.2%
0 165
 
0.2%
( 151
 
0.2%
Other values (12) 538
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1803112
99.0%
None 17297
 
1.0%
Latin Ext Additional 178
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 214955
 
11.9%
n 155743
 
8.6%
e 142976
 
7.9%
o 137584
 
7.6%
i 116282
 
6.4%
r 99257
 
5.5%
l 83851
 
4.7%
t 80842
 
4.5%
s 71222
 
3.9%
(space) 58473
 
3.2%
Other values (64) 641927
35.6%
None
ValueCountFrequency (%)
ó 8847
51.1%
á 2728
 
15.8%
í 1979
 
11.4%
é 1235
 
7.1%
ñ 556
 
3.2%
ã 434
 
2.5%
ō 364
 
2.1%
ú 211
 
1.2%
à 157
 
0.9%
Ō 94
 
0.5%
Other values (46) 692
 
4.0%
Latin Ext Additional
ValueCountFrequency (%)
ế 56
31.5%
23
12.9%
19
 
10.7%
18
 
10.1%
12
 
6.7%
11
 
6.2%
11
 
6.2%
7
 
3.9%
5
 
2.8%
4
 
2.2%
Other values (7) 12
 
6.7%

level3Gid
Text

Missing 

Distinct: 3021
Distinct (%): 7.6%
Missing: 1886622
Missing (%): 97.9%
Memory size: 14.7 MiB

Length

Max length: 36
Median length: 11
Mean length: 11.67596993
Min length: 5

Characters and Unicode

Total characters: 464365
Distinct characters: 44
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1211
Unique (%): 3.0%

Sample

1st row: PAN.4.2.6_1
2nd row: PAN.12.2.2_1
3rd row: MMR.4.2.6_1
4th row: PAN.12.1.4_1
5th row: CAN.9.20.18_1
ValueCountFrequency (%)
pan.4.2.4_1 3201
 
8.0%
mdg.2.1.5_1 2581
 
6.5%
pan.4.2.6_1 2281
 
5.7%
cri.5.2.1_1 2199
 
5.5%
pan.4.5.5_1 1729
 
4.3%
can.6.2.11_1 743
 
1.9%
pan.11.1.5_1 729
 
1.8%
phl.20.2.8_1 443
 
1.1%
phl.25.27.3_1 382
 
1.0%
pan.12.1.4_1 370
 
0.9%
Other values (3011) 25113
63.1%

Most occurring characters

ValueCountFrequency (%)
. 119301
25.7%
1 77737
16.7%
_ 39767
 
8.6%
2 30165
 
6.5%
4 20568
 
4.4%
N 19748
 
4.3%
A 19069
 
4.1%
P 17786
 
3.8%
5 17385
 
3.7%
C 11012
 
2.4%
Other values (34) 91827
19.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 185939
40.0%
Other Punctuation 119301
25.7%
Uppercase Letter 119299
25.7%
Connector Punctuation 39767
 
8.6%
Lowercase Letter 47
 
< 0.1%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 19748
16.6%
A 19069
16.0%
P 17786
14.9%
C 11012
9.2%
H 7837
 
6.6%
I 6226
 
5.2%
R 6127
 
5.1%
L 5932
 
5.0%
D 5171
 
4.3%
M 3707
 
3.1%
Other values (13) 16684
14.0%
Decimal Number
ValueCountFrequency (%)
1 77737
41.8%
2 30165
 
16.2%
4 20568
 
11.1%
5 17385
 
9.3%
3 10783
 
5.8%
6 8776
 
4.7%
8 5685
 
3.1%
9 5534
 
3.0%
7 5339
 
2.9%
0 3967
 
2.1%
Lowercase Letter
ValueCountFrequency (%)
a 13
27.7%
c 12
25.5%
b 9
19.1%
d 6
12.8%
e 4
 
8.5%
f 1
 
2.1%
l 1
 
2.1%
s 1
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 119301
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 39767
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 345019
74.3%
Latin 119346
 
25.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 19748
16.5%
A 19069
16.0%
P 17786
14.9%
C 11012
9.2%
H 7837
 
6.6%
I 6226
 
5.2%
R 6127
 
5.1%
L 5932
 
5.0%
D 5171
 
4.3%
M 3707
 
3.1%
Other values (21) 16731
14.0%
Common
ValueCountFrequency (%)
. 119301
34.6%
1 77737
22.5%
_ 39767
 
11.5%
2 30165
 
8.7%
4 20568
 
6.0%
5 17385
 
5.0%
3 10783
 
3.1%
6 8776
 
2.5%
8 5685
 
1.6%
9 5534
 
1.6%
Other values (3) 9318
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 464365
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 119301
25.7%
1 77737
16.7%
_ 39767
 
8.6%
2 30165
 
6.5%
4 20568
 
4.4%
N 19748
 
4.3%
A 19069
 
4.1%
P 17786
 
3.8%
5 17385
 
3.7%
C 11012
 
2.4%
Other values (34) 91827
19.8%

level3Name
Text

Missing 

Distinct: 2871
Distinct (%): 7.4%
Missing: 1887342
Missing (%): 98.0%
Memory size: 14.7 MiB

Length

Max length: 32
Median length: 29
Mean length: 9.371411744
Min length: 2

Characters and Unicode

Total characters: 365963
Distinct characters: 125
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1139
Unique (%): 2.9%

Sample

1st row: Cristóbal
2nd row: Chepillo
3rd row: Myitkyina
4th row: Pedro González
5th row: Kenora, Unorganized
ValueCountFrequency (%)
cativá 3201
 
5.8%
nosibe 2581
 
4.7%
cristóbal 2281
 
4.1%
limon 2199
 
4.0%
portobelo 1729
 
3.1%
harbour 745
 
1.4%
sachs 743
 
1.3%
veracruz 729
 
1.3%
santa 615
 
1.1%
unorganized 585
 
1.1%
Other values (3192) 39692
72.0%

Most occurring characters

ValueCountFrequency (%)
a 46333
 
12.7%
o 27885
 
7.6%
i 25003
 
6.8%
n 22399
 
6.1%
r 19868
 
5.4%
e 19376
 
5.3%
t 17849
 
4.9%
(space) 16049
 
4.4%
l 15536
 
4.2%
s 13794
 
3.8%
Other values (115) 141871
38.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 285283
78.0%
Uppercase Letter 53616
 
14.7%
Space Separator 16049
 
4.4%
Other Punctuation 3460
 
0.9%
Decimal Number 3386
 
0.9%
Open Punctuation 1574
 
0.4%
Dash Punctuation 1339
 
0.4%
Close Punctuation 1256
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 46333
16.2%
o 27885
9.8%
i 25003
 
8.8%
n 22399
 
7.9%
r 19868
 
7.0%
e 19376
 
6.8%
t 17849
 
6.3%
l 15536
 
5.4%
s 13794
 
4.8%
b 11070
 
3.9%
Other values (62) 66170
23.2%
Uppercase Letter
ValueCountFrequency (%)
C 9110
17.0%
N 4814
 
9.0%
P 4571
 
8.5%
L 4478
 
8.4%
S 4471
 
8.3%
B 4332
 
8.1%
T 2737
 
5.1%
M 2485
 
4.6%
A 2379
 
4.4%
H 1680
 
3.1%
Other values (23) 12559
23.4%
Decimal Number
ValueCountFrequency (%)
1 842
24.9%
5 814
24.0%
2 601
17.7%
3 231
 
6.8%
0 194
 
5.7%
4 191
 
5.6%
6 160
 
4.7%
7 129
 
3.8%
8 116
 
3.4%
9 108
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 2306
66.6%
, 1029
29.7%
' 100
 
2.9%
/ 22
 
0.6%
! 2
 
0.1%
* 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 16049
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1574
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1339
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1256
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 338899
92.6%
Common 27064
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 46333
 
13.7%
o 27885
 
8.2%
i 25003
 
7.4%
n 22399
 
6.6%
r 19868
 
5.9%
e 19376
 
5.7%
t 17849
 
5.3%
l 15536
 
4.6%
s 13794
 
4.1%
b 11070
 
3.3%
Other values (95) 119786
35.3%
Common
ValueCountFrequency (%)
(space) 16049
59.3%
. 2306
 
8.5%
( 1574
 
5.8%
- 1339
 
4.9%
) 1256
 
4.6%
, 1029
 
3.8%
1 842
 
3.1%
5 814
 
3.0%
2 601
 
2.2%
3 231
 
0.9%
Other values (10) 1023
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 357168
97.6%
None 8669
 
2.4%
Latin Ext Additional 126
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 46333
 
13.0%
o 27885
 
7.8%
i 25003
 
7.0%
n 22399
 
6.3%
r 19868
 
5.6%
e 19376
 
5.4%
t 17849
 
5.0%
(space) 16049
 
4.5%
l 15536
 
4.3%
s 13794
 
3.9%
Other values (62) 133076
37.3%
None
ValueCountFrequency (%)
á 4090
47.2%
ó 2564
29.6%
í 832
 
9.6%
é 289
 
3.3%
ñ 187
 
2.2%
à 157
 
1.8%
è 103
 
1.2%
â 99
 
1.1%
Đ 63
 
0.7%
ú 41
 
0.5%
Other values (26) 244
 
2.8%
Latin Ext Additional
ValueCountFrequency (%)
27
21.4%
25
19.8%
13
10.3%
11
8.7%
9
 
7.1%
8
 
6.3%
8
 
6.3%
6
 
4.8%
ế 5
 
4.0%
3
 
2.4%
Other values (7) 11
8.7%

iucnRedListCategory
Text

Missing 

Distinct: 13
Distinct (%): < 0.1%
Missing: 469562
Missing (%): 24.4%
Memory size: 14.7 MiB

Length

Max length: 24
Median length: 2
Mean length: 2.000048736
Min length: 2

Characters and Unicode

Total characters: 2913733
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: NE
2nd row: NE
3rd row: NE
4th row: NE
5th row: NE
ValueCountFrequency (%)
ne 1307916
89.8%
lc 117121
 
8.0%
dd 11259
 
0.8%
nt 6488
 
0.4%
vu 6192
 
0.4%
cr 3404
 
0.2%
en 3150
 
0.2%
ex 1118
 
0.1%
ew 179
 
< 0.1%
2024-12-02t13:57:06.570z 1
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N 1317554
45.2%
E 1312363
45.0%
C 120525
 
4.1%
L 117121
 
4.0%
D 22518
 
0.8%
T 6491
 
0.2%
V 6192
 
0.2%
U 6192
 
0.2%
R 3404
 
0.1%
X 1118
 
< 0.1%
Other values (15) 255
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2913660
> 99.9%
Decimal Number 58
 
< 0.1%
Other Punctuation 9
 
< 0.1%
Dash Punctuation 6
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 1317554
45.2%
E 1312363
45.0%
C 120525
 
4.1%
L 117121
 
4.0%
D 22518
 
0.8%
T 6491
 
0.2%
V 6192
 
0.2%
U 6192
 
0.2%
R 3404
 
0.1%
X 1118
 
< 0.1%
Other values (2) 182
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 13
22.4%
1 9
15.5%
0 9
15.5%
5 8
13.8%
7 6
10.3%
4 4
 
6.9%
3 3
 
5.2%
9 3
 
5.2%
6 2
 
3.4%
8 1
 
1.7%
Other Punctuation
ValueCountFrequency (%)
: 6
66.7%
. 3
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2913660
> 99.9%
Common 73
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
2 13
17.8%
1 9
12.3%
0 9
12.3%
5 8
11.0%
- 6
8.2%
: 6
8.2%
7 6
8.2%
4 4
 
5.5%
3 3
 
4.1%
. 3
 
4.1%
Other values (3) 6
8.2%
Latin
ValueCountFrequency (%)
N 1317554
45.2%
E 1312363
45.0%
C 120525
 
4.1%
L 117121
 
4.0%
D 22518
 
0.8%
T 6491
 
0.2%
V 6192
 
0.2%
U 6192
 
0.2%
R 3404
 
0.1%
X 1118
 
< 0.1%
Other values (2) 182
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2913733
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1317554
45.2%
E 1312363
45.0%
C 120525
 
4.1%
L 117121
 
4.0%
D 22518
 
0.8%
T 6491
 
0.2%
V 6192
 
0.2%
U 6192
 
0.2%
R 3404
 
0.1%
X 1118
 
< 0.1%
Other values (15) 255
 
< 0.1%